MANNK.022C1 PATENT 
EXPRESSION VECTORS ENCODING EPITOPES OF 
TARGET-ASSOCIATED ANTIGENS AND METHODS FOR THEIR DESIGN 

Cross Reference to Related Applications 
[0001] This application is a continuation of U.S. Application No. 10/292,413, 
filed on November 7, 2002, entitled "EXPRESSION VECTORS ENCODING EPITOPES 
OF TARGET-ASSOCIATED ANTIGENS AND METHODS FOR THEIR DESIGN," which
claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 60/336,968, 
filed on November 7, 2001, having the same title; both of which are hereby incorporated by 
reference in their entirety. 

Background of the Invention 

Field of the Invention 

[0002] The invention disclosed herein is directed to methods for the design of 
epitope-encoding vectors, and epitope cluster regions, for use in compositions, including for
example, pharmaceutical compositions capable of inducing an immune response in a subject 
to whom the compositions are administered. The invention is further directed to the vectors 
themselves. The epitope(s) expressed using such vectors can stimulate a cellular immune 
response against a target cell displaying the epitope(s). 

Description of the Related Art 

[0003] The immune system can be categorized into two discrete effector arms.
The first is innate immunity, which involves numerous cellular components and soluble
factors that respond to all infectious challenges. The other is the adaptive immune response,
which is customized to respond specifically to precise epitopes from infectious agents. The 
adaptive immune response is further broken down into two effector arms known as the 
humoral and cellular immune systems. The humoral arm is centered on the production of 
antibodies by B-lymphocytes while the cellular arm involves the killer cell activity of 
cytotoxic T Lymphocytes.

[0004] Cytotoxic T Lymphocytes (CTL) do not recognize epitopes on the
infectious agents themselves. Rather, CTL detect fragments of antigens derived from
infectious agents that are displayed on the surface of infected cells. As a result, antigens are
visible to CTL only after they have been processed by the infected cell and thus displayed on 
the surface of the cell. 

[0005] The antigen processing and display system on the surface of cells has been 
well established. CTL recognize short peptide antigens, which are displayed on the surface in 
non-covalent association with class I major histocompatibility complex molecules (MHC). 
These class I peptides are in turn derived from the degradation of cytosolic proteins. 

Summary of the Invention 

[0006] The invention disclosed herein relates to the identification of epitope 
cluster regions that are used to generate pharmaceutical compositions capable of inducing an 
immune response from a subject to whom the compositions have been administered. One 
embodiment of the disclosed invention relates to an epitope cluster, the cluster being derived 
from an antigen associated with a target, the cluster including or encoding at least two 
sequences having a known or predicted affinity for an MHC receptor peptide binding cleft, 
wherein the cluster is an incomplete fragment of the antigen. 

[0007] In one aspect of the invention, the target is a neoplastic cell. 

[0008] In another aspect of the invention, the MHC receptor may be a class I HLA 
receptor. 

[0009] In yet another aspect of the invention, the cluster includes or encodes a 
polypeptide having a length, wherein the length is at least 10 amino acids. Advantageously, 
the length of the polypeptide may be less than about 75 amino acids. 

[0010] In still another aspect of the invention, there is provided an antigen having 
a length, wherein the cluster consists of or encodes a polypeptide having a length, wherein 
the length of the polypeptide is less than about 80% of the length of the antigen. Preferably, 
the length of the polypeptide is less than about 50% of the length of the antigen. Most 
preferably, the length of the polypeptide is less than about 20% of the length of the antigen. 
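The length criteria of the two preceding aspects can be expressed as a simple check. The following Python sketch is illustrative only: the function names are hypothetical, the word "about" is approximated here by exact thresholds, and no particular antigen length is implied.

```python
def meets_absolute_length(cluster_len: int) -> bool:
    """At least 10 residues and, advantageously, fewer than about 75."""
    return 10 <= cluster_len < 75


def relative_length_tier(cluster_len: int, antigen_len: int) -> str:
    """Name the tightest relative-length tier that the cluster satisfies.

    Assumption: "about" is treated as an exact cutoff for illustration.
    """
    if cluster_len <= 0 or antigen_len <= 0:
        raise ValueError("lengths must be positive")
    fraction = cluster_len / antigen_len
    if fraction < 0.20:
        return "most preferred (<20%)"
    if fraction < 0.50:
        return "preferred (<50%)"
    if fraction < 0.80:
        return "within aspect (<80%)"
    return "outside aspect (>=80%)"
```

Under these assumptions, a 24-residue cluster taken from a hypothetical 200-residue antigen satisfies the absolute length check and falls in the most preferred tier.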



[0011] Embodiments of the invention particularly relate to epitope clusters 
identified in the tumor-associated antigen SSX-2 (SEQ ID NO: 40). One embodiment of the
invention relates to an isolated nucleic acid containing a reading frame with a first sequence 
encoding one or more segments of SSX-2, wherein the whole antigen is not encoded, wherein 
each segment contains an epitope cluster, and wherein each cluster contains at least two 
amino acid sequences with a known or predicted affinity for a same MHC receptor peptide 
binding cleft. In various aspects of the invention the epitope cluster can be amino acids 5-28, 
16-28, 41-65, 57-67, 99-114, 167-180, and 167-183 of SSX-2. In other aspects the segments 
can consist of an epitope cluster; the first sequence can be a fragment of SSX-2; the fragment 
can consist of a polypeptide having a length, wherein the length of the polypeptide is less
than about 90, 80, 60, 50, 25, or 10% of the length of SSX-2; the fragment can consist
essentially of an amino acid sequence beginning at amino acid 5, 16, 41, 57, or 99 and ending
at amino acid 65, 67, 114, 180, or 183 of SSX-2; or the fragment consists of amino acids
15-183 of SSX-2. Further embodiments of the invention include a second sequence encoding
essentially a housekeeping epitope. In one aspect of this embodiment the first and second 
sequences constitute a single reading frame. In aspects of the invention the reading frame is 
operably linked to a promoter. Other embodiments of the invention include the polypeptides 
encoded by the nucleic acid embodiments of the invention and immunogenic compositions 
containing the nucleic acids or polypeptides of the invention. 
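Several of the SSX-2 cluster positions recited above overlap one another. As an illustration only, the following Python sketch merges the recited ranges into disjoint spans and counts the residues they cover. The merging step and the function names are assumptions made for this example; merging is not stated herein to be the method by which segments are chosen.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # 1-based, inclusive residue positions

# Cluster positions as recited in the text for SSX-2.
SSX2_CLUSTERS: List[Range] = [
    (5, 28), (16, 28), (41, 65), (57, 67), (99, 114), (167, 180), (167, 183),
]


def merge_ranges(ranges: List[Range]) -> List[Range]:
    """Merge overlapping 1-based inclusive ranges into disjoint spans."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if start > end:
            raise ValueError(f"invalid range: {start}-{end}")
        if merged and start <= merged[-1][1]:
            # Overlaps the previous span: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_residues(ranges: List[Range]) -> int:
    """Count residues covered by the union of the ranges."""
    return sum(end - start + 1 for start, end in merge_ranges(ranges))
```

Applied to the recited positions, the sketch yields four disjoint spans (5-28, 41-67, 99-114, and 167-183) covering 84 residues in total.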

[0012] Embodiments of the invention provide expression cassettes, for example, 
for use in vaccine vectors, which encode one or more embedded housekeeping epitopes, and 
methods for designing and testing such expression cassettes. Housekeeping epitopes can be 
liberated from the translation product of such cassettes through proteolytic processing by the 
immunoproteasome of professional antigen presenting cells (pAPC). In one embodiment of 
the invention, sequences flanking the housekeeping epitope(s) can be altered to promote 
cleavage by the immunoproteasome at the desired location(s). Housekeeping epitopes, their 
uses, and identification are described in U.S. Patent Application Nos. 09/560,465 and 
09/561,074 entitled EPITOPE SYNCHRONIZATION IN ANTIGEN PRESENTING 
CELLS, and METHOD OF EPITOPE DISCOVERY, respectively; both of which were filed 
on April 28, 2000, and which are both incorporated herein by reference in their entireties.

[0013] Examples of housekeeping epitopes are disclosed in provisional U.S.
Patent Applications entitled EPITOPE SEQUENCES, Nos. 60/282,211, filed on April 6,
2001; 60/337,017, filed on November 7, 2001; 60/363,210, filed on March 7, 2002; and 60/409,123, filed
on September 5, 2002; and U.S. Application No. 10/117,937, filed on April 4, 2002, which is 
also entitled EPITOPE SEQUENCES; which are all incorporated herein by reference in their 
entirety. 

[0014] In other embodiments of the invention, the housekeeping epitope(s) can be 
flanked by arbitrary sequences or by sequences incorporating residues known to be favored in 
immunoproteasome cleavage sites. As used herein the term "arbitrary sequences" refers to 
sequences chosen without reference to the native sequence context of the epitope, their ability 
to promote processing, or immunological function. In further embodiments of the invention 
multiple epitopes can be arrayed head-to-tail. These arrays can be made up entirely of 
housekeeping epitopes. Likewise, the arrays can include alternating housekeeping and 
immune epitopes. Alternatively, the arrays can include housekeeping epitopes flanked by 
immune epitopes, whether complete or distally truncated. Further, the arrays can be of any 
other similar arrangement. There is no restriction on placing a housekeeping epitope at the 
terminal positions of the array. The vectors can additionally contain authentic protein coding 
sequences or segments thereof containing epitope clusters as a source of immune epitopes. 
The term "authentic" refers to natural protein sequences. 

[0015] Epitope clusters and their uses are described in U.S. Patent application 
Nos. 09/561,571 entitled EPITOPE CLUSTERS, filed on April 28, 2000; 10/005,905, 
entitled EPITOPE SYNCHRONIZATION IN ANTIGEN PRESENTING CELLS, filed on 
November 7, 2001; and 10/026,066, filed on December 7, 2001, also entitled EPITOPE 
SYNCHRONIZATION IN ANTIGEN PRESENTING CELLS; all of which are incorporated 
herein by reference in their entirety. 

[0016] Embodiments of the invention can encompass screening the constructs to 
determine whether the housekeeping epitope is liberated. In constructs containing multiple 
housekeeping epitopes, embodiments can include screening to determine which epitopes are 
liberated. In a preferred embodiment, a vector containing an embedded epitope can be used 
to immunize HLA transgenic mice and the resultant CTL can be tested for their ability to
recognize target cells presenting the mature epitope. In another embodiment, target cells
expressing immunoproteasome can be transformed with the vector. The target cell may 
express immunoproteasome either constitutively, because of treatment with interferon (IFN), 
or through genetic manipulation, for example. CTL that recognize the mature epitope can be 
tested for their ability to recognize these target cells. In yet another embodiment, the 
embedded epitope can be prepared as a synthetic peptide. The synthetic peptide then can be 
subjected to digestion by an immunoproteasome preparation in vitro and the resultant 
fragments can be analyzed to determine the sites of cleavage. Such polypeptides,
recombinant or synthetic, from which embedded epitopes can be successfully liberated, can 
also be incorporated into immunogenic compositions. 
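The analysis of digest fragments to determine sites of cleavage can be sketched computationally. The following Python example is a minimal illustration under stated assumptions: the function names are hypothetical, each fragment is located at its first occurrence in the substrate, and the flanking residues placed around the SLLMWITQC epitope in the usage note are arbitrary placeholders rather than any sequence disclosed herein. The check focuses on the C-terminus of the epitope because the N-terminus may be produced by subsequent N-terminal trimming.

```python
from typing import Iterable, List


def cleavage_sites(substrate: str, fragments: Iterable[str]) -> List[int]:
    """Return cut positions, each expressed as the number of residues
    N-terminal to the cut (i.e. cleavage after that 1-based residue)."""
    sites = set()
    for frag in fragments:
        start = substrate.find(frag)  # assumption: first occurrence only
        if start < 0:
            raise ValueError(f"fragment {frag!r} not found in substrate")
        end = start + len(frag)
        if start > 0:
            sites.add(start)  # cut that produced the fragment's N-terminus
        if end < len(substrate):
            sites.add(end)    # cut that produced the fragment's C-terminus
    return sorted(sites)


def c_terminus_generated(substrate: str, epitope: str,
                         fragments: Iterable[str]) -> bool:
    """True if some observed cut falls immediately after the epitope."""
    pos = substrate.find(epitope)
    if pos < 0:
        raise ValueError("epitope not found in substrate")
    return (pos + len(epitope)) in cleavage_sites(substrate, fragments)
```

For the placeholder substrate AAASLLMWITQCKKK, observed fragments AAASLLMWITQC and KKK would indicate a single cut after residue 12, which coincides with the C-terminus of the embedded epitope.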

[0017] The invention disclosed herein relates to the identification of a polypeptide 
suitable for epitope liberation. One embodiment of the invention relates to a method of
identifying a polypeptide suitable for epitope liberation including, for example, the steps of 
identifying an epitope of interest; providing a substrate polypeptide sequence including the 
epitope, wherein the substrate polypeptide permits processing by a proteasome; contacting 
the substrate polypeptide with a composition including the proteasome, under conditions that 
support processing of the substrate polypeptide by the proteasome; and assaying for liberation 
of the epitope. 

[0018] The epitope can be embedded in the substrate polypeptide, and in some 
aspects the substrate polypeptide can include more than one epitope, for example. Also, the 
epitope can be a housekeeping epitope. 

[0019] In one aspect, the substrate polypeptide can be a synthetic peptide. 
Optionally, the substrate polypeptide can be included in a formulation promoting protein 
transfer. Alternatively, the substrate polypeptide can be a fusion protein. The fusion protein 
can further include a protein domain possessing protein transfer activity. Further, the 
contacting step can include immunization with the substrate polypeptide. 

[0020] In another aspect, the substrate polypeptide can be encoded by a 
polynucleotide. The contacting step can include immunization with a vector including the 
polynucleotide, for example. The immunization can be carried out in an HLA-transgenic 
mouse or any other suitable animal, for example. Alternatively, the contacting step can
include transforming a cell with a vector including the polynucleotide. In some embodiments
the transformed cell can be a target cell that is targeted by CTL for purposes of assaying for 
proper liberation of epitope. 

[0021] The proteasome processing can take place intracellularly, either in vitro or
in vivo. Further, the proteasome processing can take place in a cell-free system. 

[0022] The assaying step can include a technique selected from the group 
including, but not limited to, mass spectrometry, N-terminal pool sequencing, HPLC, and the 
like. Also, the assaying step can include a T cell target recognition assay. The T cell target 
recognition assay can be selected from the group including, but not limited to, a cytolytic 
activity assay, a chromium release assay, a cytokine assay, an ELISPOT assay, tetramer 
analysis, and the like. 

[0023] In still another aspect, the amino acid sequence of the substrate 
polypeptide including the epitope can be arbitrary. Also, the substrate polypeptide in which 
the epitope is embedded can be derived from an authentic sequence of a target-associated 
antigen. Further, the substrate polypeptide in which the epitope is embedded can be 
conformed to a preferred immune proteasome cleavage site flanking sequence. 

[0024] In another aspect, the substrate polypeptide can include an array of 
additional epitopes. Members of the array can be arranged head-to-tail, for example. The 
array can include more than one housekeeping epitope. The more than one housekeeping 
epitope can include copies of the same epitope. The array can include a housekeeping and an 
immune epitope, or alternating housekeeping and immune epitopes, for example. Also, the 
array can include a housekeeping epitope positioned between two immune epitopes in an 
epitope battery. The array can include multiple epitope batteries, so that there are two 
immune epitopes between each housekeeping epitope in the interior of the array. Optionally, 
at least one of the epitopes can be truncated distally to its junction with an adjacent epitope. 
The truncated epitopes can be immune epitopes, for example. The truncated epitopes can 
have lengths selected from the group including, but not limited to, 9, 8, 7, 6, 5, 4 amino acids, 
and the like.

[0025] In still another aspect, the substrate polypeptide can include an array of 
epitopes and epitope clusters. Members of the array can be arranged head-to-tail, for 
example. 

[0026] In yet another aspect, the proteasome can be an immune proteasome. 

[0027] Another embodiment of the disclosed invention relates to vectors 
including a housekeeping epitope expression cassette. The housekeeping epitope(s) can be 
derived from a target-associated antigen, and the housekeeping epitope can be liberatable, 
that is capable of liberation, from a translation product of the cassette by immunoproteasome 
processing. 

[0028] In one aspect of the invention the expression cassette can encode an array 
of two or more epitopes or at least one epitope and at least one epitope cluster. The members 
of the array can be arranged head-to-tail, for example. Also, the members of the array can be 
arranged head-to-tail separated by spacing sequences, for example. Further, the array can 
include a plurality of housekeeping epitopes. The plurality of housekeeping epitopes can 
include more than one copy of the same epitope or single copies of distinct epitopes, for 
example. The array can include at least one housekeeping epitope and at least one immune 
epitope. Also, the array can include alternating housekeeping and immune epitopes. Further, 
the array includes a housekeeping epitope sandwiched between two immune epitopes so that 
there are two immune epitopes between each housekeeping epitope in the interior of the 
array. The immune epitopes can be truncated distally to their junction with the adjacent 
housekeeping epitope. 
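The array arrangements described above can be illustrated by simple string assembly. In the following Python sketch the function names are hypothetical, and the two flanking "immune epitope" strings used in the usage note (ACDEFGHIK and LMNPQRSTV) are arbitrary placeholders, not real epitopes. Distal truncation is interpreted here as removing residues from the end of each immune epitope that is farthest from its junction with the housekeeping epitope.

```python
from typing import List, Optional


def battery(housekeeping: str, immune_n: str, immune_c: str,
            keep: Optional[int] = None) -> str:
    """Place a housekeeping epitope between two immune epitopes.

    If `keep` is given, each immune epitope is truncated distally so that
    only the `keep` residues adjacent to the junction remain.
    """
    if keep is not None:
        if keep < 1:
            raise ValueError("keep must be at least 1")
        immune_n = immune_n[-keep:]  # retain the C-terminal residues
        immune_c = immune_c[:keep]   # retain the N-terminal residues
    return immune_n + housekeeping + immune_c


def head_to_tail(members: List[str], spacer: str = "") -> str:
    """Join array members head-to-tail, optionally with a spacing sequence."""
    return spacer.join(members)
```

With SLLMWITQC as the central epitope and keep=5, the sketch yields FGHIKSLLMWITQCLMNPQ. Joining two such batteries head-to-tail leaves two (truncated) immune epitopes between the housekeeping epitopes in the interior of the array.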

[0029] In another aspect, the expression cassette further encodes an authentic 
protein sequence, or segment thereof, including at least one immune epitope. Optionally, the 
segment can include at least one epitope cluster. The housekeeping epitope expression 
cassette and the authentic sequence including at least one immune epitope can be encoded in 
a single reading frame or transcribed as a single mRNA species, for example. Also, the 
housekeeping epitope expression cassette and the authentic sequence including at least one 
immune epitope may not be transcribed as a single mRNA species. 

[0030] In yet another aspect, the vector can include a DNA molecule or an RNA 
molecule. The vector can encode, for example, SEQ ID NO. 4, SEQ ID NO. 17, SEQ ID
NO. 20, SEQ ID NO. 26, SEQ ID NO. 27, SEQ ID NO. 29, SEQ ID NO. 33, and the like.
Also, the vector can include SEQ ID NO. 9, SEQ ID NO. 19, SEQ ID NO. 21, SEQ ID NO. 
30, SEQ ID NO. 34, and the like. Also, the vector can encode SEQ ID NO. 5 or SEQ ID NO. 
18, for example. 

[0031] In still another aspect, the target-associated antigen can be an antigen 
derived from or associated with a tumor or an intracellular parasite, and the intracellular 
parasite can be, for example, a virus, a bacterium, a protozoan, or the like. 

[0032] Another embodiment of the invention relates to vectors including a 
housekeeping epitope identified according to any of the methods disclosed herein, claimed or 
otherwise. For example, embodiments can relate to a vector encoding a substrate polypeptide
that includes a housekeeping epitope by any of the methods described herein. 

[0033] In one aspect, the housekeeping epitope can be liberated from the cassette 
translation product by immune proteasome processing.

[0034] Another embodiment of the disclosed invention relates to methods of 
activating a T cell. The methods can include, for example, the steps of contacting a vector 
including a housekeeping epitope expression cassette with an APC. The housekeeping 
epitope can be derived from a target-associated antigen, for example, and the housekeeping 
epitope can be liberatable from a translation product of the cassette by immunoproteasome 
processing. The methods can further include contacting the APC with a T cell. The 
contacting of the vector with the APC can occur in vitro or in vivo. 

[0035] Another embodiment of the disclosed invention relates to a substrate 
polypeptide including a housekeeping epitope wherein the housekeeping epitope can be 
liberated by immunoproteasome processing in a pAPC. 

[0036] Another embodiment of the disclosed invention relates to a method of 
activating a T cell comprising contacting a substrate polypeptide including a housekeeping 
epitope with an APC wherein the housekeeping epitope can be liberated by 
immunoproteasome processing and contacting the APC with a T cell.

Brief Description of the Drawings 

[0037] Figure 1 depicts the sequence of Melan-A (SEQ ID NO: 2), showing 
clustering of class I HLA epitopes. 

[0038] Figure 2 depicts the sequence of SSX-2 (SEQ ID NO: 40), showing 
clustering of class I HLA epitopes. 

[0039] Figure 3 depicts the sequence of NY-ESO (SEQ ID NO: 11), showing
clustering of class I HLA epitopes. 

[0040] Figure 4. An illustrative drawing depicting pMA2M. 

[0041] Figure 5. Assay results showing the % of specific lysis of ELAGIGILTV 
pulsed and unpulsed T2 target cells by mock immunized CTL. 

[0042] Figure 6. Assay results showing the % of specific lysis of ELAGIGILTV 
pulsed and unpulsed T2 target cells by pVAXM3 immunized CTL. 

[0043] Figure 7. Assay results showing the % of specific lysis of ELAGIGILTV 
pulsed and unpulsed T2 target cells by pVAXM2 immunized CTL. 

[0044] Figure 8. Assay results showing the % of specific lysis of ELAGIGILTV 
pulsed and unpulsed T2 target cells by pVAXM1 immunized CTL.

[0045] Figure 9. Illustrates a sequence of SEQ ID NO. 22 from which the
NY-ESO-1 157-165 epitope is liberated by immunoproteasomal processing.

[0046] Figure 10. Shows the differential processing by immunoproteasome and 
housekeeping proteasome of the SLLMWITQC epitope (SEQ ID NO. 12) in its native 
context where the cleavage following the C is more efficiently produced by housekeeping 
than immunoproteasome. 

[0047] Figure 11. 8A: Shows the results of the human immunoproteasome digest 
of SEQ ID NO. 31. 8B: Shows the comparative results of mouse versus human 
immunoproteasome digestion of SEQ ID NO. 31. 

[0048] Figure 12. Shows the differential processing of SSX-2 31-68 by
housekeeping and immunoproteasome.

Detailed Description of the Preferred Embodiment

Definitions 

[0049] Unless otherwise clear from the context of the use of a term herein, the 
following listed terms shall generally have the indicated meanings for purposes of this 
description. 

[0050] PROFESSIONAL ANTIGEN-PRESENTING CELL (pAPC) - a cell that 
possesses T cell costimulatory molecules and is able to induce a T cell response. Well 
characterized pAPCs include dendritic cells, B cells, and macrophages. 

[0051] PERIPHERAL CELL - a cell that is not a pAPC. 

[0052] HOUSEKEEPING PROTEASOME - a proteasome normally active in 
peripheral cells, and generally not present or not strongly active in pAPCs. 

[0053] IMMUNOPROTEASOME - a proteasome normally active in pAPCs; the 
immunoproteasome is also active in some peripheral cells in infected tissues or following 
exposure to interferon. 

[0054] EPITOPE - a molecule or substance capable of stimulating an immune 
response. In preferred embodiments, epitopes according to this definition include but are not 
necessarily limited to a polypeptide and a nucleic acid encoding a polypeptide, wherein the 
polypeptide is capable of stimulating an immune response. In other preferred embodiments, 
epitopes according to this definition include but are not necessarily limited to peptides 
presented on the surface of cells, the peptides being non-covalently bound to the binding cleft 
of class I MHC, such that they can interact with T cell receptors (TCR). Epitopes presented 
by class I MHC may be in immature or mature form. "Mature" refers to an MHC epitope in 
distinction to any precursor ("immature") that may include or consist essentially of a 
housekeeping epitope, but also includes other sequences in a primary translation product that 
are removed by processing, including without limitation, alone or in any combination, 
proteasomal digestion, N-terminal trimming, or the action of exogenous enzymatic activities. 
Thus, a mature epitope may be provided embedded in a somewhat longer polypeptide, the 
immunological potential of which is due, at least in part, to the embedded epitope; or in its 
ultimate form that can bind in the MHC binding cleft to be recognized by TCR, respectively.

[0055] MHC EPITOPE - a polypeptide having a known or predicted binding
affinity for a mammalian class I or class II major histocompatibility complex (MHC) 
molecule. 

[0056] HOUSEKEEPING EPITOPE - In a preferred embodiment, a 
housekeeping epitope is defined as a polypeptide fragment that is an MHC epitope, and that 
is displayed on a cell in which housekeeping proteasomes are predominantly active. In 
another preferred embodiment, a housekeeping epitope is defined as a polypeptide containing 
a housekeeping epitope according to the foregoing definition, that is flanked by one to several 
additional amino acids. In another preferred embodiment, a housekeeping epitope is defined 
as a nucleic acid that encodes a housekeeping epitope according to the foregoing definitions. 
Exemplary housekeeping epitopes are provided in U.S. Application No. 10/117,937, filed on
April 4, 2002; and U.S. Provisional Application Nos. 60/282,211, filed on April 6, 2001;
60/337,017, filed on November 7, 2001; 60/363,210, filed on March 7, 2002; and 60/409,123, filed on
September 5, 2002; all of which are entitled EPITOPE SEQUENCES, and all of which above 
were incorporated herein by reference in their entireties. 

[0057] IMMUNE EPITOPE - In a preferred embodiment, an immune epitope is 
defined as a polypeptide fragment that is an MHC epitope, and that is displayed on a cell in 
which immunoproteasomes are predominantly active. In another preferred embodiment, an 
immune epitope is defined as a polypeptide containing an immune epitope according to the 
foregoing definition, that is flanked by one to several additional amino acids. In another 
preferred embodiment, an immune epitope is defined as a polypeptide including an epitope 
cluster sequence, having at least two polypeptide sequences having a known or predicted 
affinity for a class I MHC. In yet another preferred embodiment, an immune epitope is 
defined as a nucleic acid that encodes an immune epitope according to any of the foregoing 
definitions. 

[0058] TARGET CELL - a cell to be targeted by the vaccines and methods of the 
invention. Examples of target cells according to this definition include but are not 
necessarily limited to: a neoplastic cell and a cell harboring an intracellular parasite, such as, 
for example, a virus, a bacterium, or a protozoan. Target cells can also include cells that are 
targeted by CTL as a part of assays to determine or confirm proper epitope liberation and
processing by a cell expressing immunoproteasome, to determine T cell specificity or
immunogenicity for a desired epitope. Such cells may be transformed to express the substrate
or liberation sequence, or the cells can simply be pulsed with peptide/epitope. 

[0059] TARGET-ASSOCIATED ANTIGEN (TAA) - a protein or polypeptide 
present in a target cell. 

[0060] TUMOR-ASSOCIATED ANTIGENS (TuAA) - a TAA, wherein the 
target cell is a neoplastic cell. 

[0061] HLA EPITOPE - a polypeptide having a known or predicted binding 
affinity for a human class I or class II HLA complex molecule. 

[0062] ANTIBODY - a natural immunoglobulin (Ig), poly- or monoclonal, or any 
molecule composed in whole or in part of an Ig binding domain, whether derived 
biochemically or by use of recombinant DNA. Examples include inter alia, F(ab), single 
chain Fv, and Ig variable region-phage coat protein fusions. 

[0063] ENCODE - an open-ended term such that a nucleic acid encoding a 
particular amino acid sequence can consist of codons specifying that polypeptide, but can
also comprise additional sequences either translatable, or for the control of transcription, 
translation, or replication, or to facilitate manipulation of some host nucleic acid construct. 

[0064] SUBSTANTIAL SIMILARITY - this term is used to refer to sequences 
that differ from a reference sequence in an inconsequential way as judged by examination of 
the sequence. Nucleic acid sequences encoding the same amino acid sequence are 
substantially similar despite differences in degenerate positions or modest differences in 
length or composition of any non-coding regions. Amino acid sequences differing only by 
conservative substitution or minor length variations are substantially similar. Additionally, 
amino acid sequences comprising housekeeping epitopes that differ in the number of
N-terminal flanking residues, or immune epitopes and epitope clusters that differ in the number
of flanking residues at either terminus, are substantially similar. Nucleic acids that encode 
substantially similar amino acid sequences are themselves also substantially similar. 
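One part of this definition, namely that nucleic acid sequences encoding the same amino acid sequence are substantially similar despite differences in degenerate positions, lends itself to a short illustration. The following Python sketch uses the standard genetic code; the function names and the two coding sequences are hypothetical and were devised for this example to encode the ELAGIGILTV peptide mentioned in the figure descriptions. It does not address the other criteria of the definition, such as conservative substitution or minor length variation.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical coding sequences differing only at degenerate positions.
SEQ_A = "GAA CTG GCC GGC ATC GGC ATC CTG ACC GTG".replace(" ", "")
SEQ_B = "GAG TTA GCT GGT ATT GGT ATT TTA ACT GTT".replace(" ", "")


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encode_same_polypeptide(dna_a: str, dna_b: str) -> bool:
    """True if both coding sequences specify the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)
```

Although SEQ_A and SEQ_B differ in at least one base of every codon, both translate to the same ten-residue peptide and would therefore satisfy this part of the definition.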

[0065] FUNCTIONAL SIMILARITY - this term is used to refer to sequences 
that differ from a reference sequence in an inconsequential way as judged by examination of 
a biological or biochemical property, although the sequences may not be substantially similar.
For example, two nucleic acids can be useful as hybridization probes for the same sequence
but encode differing amino acid sequences. Two peptides that induce cross-reactive CTL 
responses are functionally similar even if they differ by non-conservative amino acid 
substitutions (and thus do not meet the substantial similarity definition). Pairs of antibodies, 
or TCRs, that recognize the same epitope can be functionally similar to each other despite 
whatever structural differences exist. In testing for functional similarity of immunogenicity 
one would generally immunize with the "altered" antigen and test the ability of the elicited 
response (Ab, CTL, cytokine production, etc.) to recognize the target antigen. Accordingly, 
two sequences may be designed to differ in certain respects while retaining the same 
function. Such designed sequence variants are among the embodiments of the present 
invention. 

[0066] EXPRESSION CASSETTE - a polynucleotide sequence encoding a 
polypeptide, operably linked to a promoter and other transcription and translation control 
elements, including but not limited to enhancers, termination codons, internal ribosome entry 
sites, and polyadenylation sites. The cassette can also include sequences that facilitate 
moving it from one host molecule to another. 

[0067] EMBEDDED EPITOPE - an epitope contained within a longer
polypeptide; the term can also include an epitope in which either the N-terminus or the
C-terminus is embedded, such that the epitope is not in an interior position.

[0068] MATURE EPITOPE - a peptide with no additional sequence beyond that 
present when the epitope is bound in the MHC peptide-binding cleft. 

[0069] EPITOPE CLUSTER - a polypeptide, or a nucleic acid sequence encoding 
it, that is a segment of a native protein sequence comprising two or more known or predicted 
epitopes with binding affinity for a shared MHC restriction element, wherein the density of 
epitopes within the cluster is greater than the density of all known or predicted epitopes with 
binding affinity for the shared MHC restriction element within the complete protein 
sequence, and as disclosed in U.S. Patent Application No. 09/561,571 entitled EPITOPE 
CLUSTERS. 
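The density comparison in this definition can be illustrated numerically. The following Python sketch is an assumption-laden example: the function name is hypothetical, density is computed as the number of epitopes per residue, and only epitopes lying entirely within the candidate segment are counted. Other counting conventions are possible, and the positions in the usage note are invented for illustration.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # 1-based, inclusive residue positions


def is_epitope_cluster(segment: Range, epitopes: List[Range],
                       protein_len: int) -> bool:
    """Test whether a segment has two or more epitopes at a density
    greater than that of the complete protein sequence.

    `epitopes` lists all known or predicted epitopes for one shared MHC
    restriction element.
    """
    seg_start, seg_end = segment
    if not (1 <= seg_start <= seg_end <= protein_len):
        raise ValueError("segment must lie within the protein")
    # Assumption: count only epitopes fully contained in the segment.
    inside = [e for e in epitopes if seg_start <= e[0] and e[1] <= seg_end]
    if len(inside) < 2:
        return False
    segment_density = len(inside) / (seg_end - seg_start + 1)
    protein_density = len(epitopes) / protein_len
    return segment_density > protein_density
```

For a hypothetical 200-residue protein with predicted epitopes at positions 10-18, 12-20, 15-23, and 150-158, the segment 10-23 contains three epitopes in 14 residues, a density well above the four epitopes in 200 residues of the whole protein, and so qualifies under these assumptions. The complete sequence itself does not qualify, since its density cannot exceed its own.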

[0070] SUBSTRATE OR LIBERATION SEQUENCE - a designed or engineered 
sequence comprising or encoding a housekeeping epitope (according to the first of the
definitions offered above) embedded in a larger sequence that provides a context allowing the
housekeeping epitope to be liberated by immunoproteasomal processing, directly or in 
combination with N-terminal trimming or other processes. 

Epitope Clusters 

[0071] Embodiments of the invention disclosed herein provide epitope cluster 
regions (ECRs) for use in vaccines and in vaccine design and epitope discovery. Specifically, 
embodiments of the invention relate to identifying epitope clusters for use in generating 
immunologically active compositions directed against target cell populations, and for use in 
the discovery of discrete housekeeping epitopes and immune epitopes. In many cases, 
numerous putative class I MHC epitopes may exist in a single target-associated antigen 
(TAA). Such putative epitopes are often found in clusters (ECRs), MHC epitopes distributed 
at a relatively high density within certain regions in the amino acid sequence of the parent 
TAA. Since these ECRs include multiple putative epitopes with potential useful biological 
activity in inducing an immune response, they represent an excellent material for in vitro or 
in vivo analysis to identify particularly useful epitopes for vaccine design. And, since the 
epitope clusters can themselves be processed inside a cell to produce active MHC epitopes, 
the clusters can be used directly in vaccines, with one or more putative epitopes in the cluster 
actually being processed into an active MHC epitope. 

[0072] The use of ECRs in vaccines offers important technological advances in 
the manufacture of recombinant vaccines, and further offers crucial advantages in safety over 
existing nucleic acid vaccines that encode whole protein sequences. Recombinant vaccines 
generally rely on expensive and technically challenging production of whole proteins in 
microbial fermentors. ECRs offer the option of using chemically synthesized polypeptides, 
greatly simplifying development and manufacture, and obviating a variety of safety concerns. 
Similarly, the ability to use nucleic acid sequences encoding ECRs, which are typically 
relatively short regions of an entire sequence, allows the use of synthetic oligonucleotide 
chemistry processes in the development and manipulation of nucleic acid based vaccines, 
rather than the more expensive, time consuming, and potentially difficult molecular biology 
procedures involved with using whole gene sequences. 

[0073] Since an ECR is encoded by a nucleic acid sequence that is relatively short
compared to that which encodes the whole protein from which the ECR is derived, this can
greatly improve the safety of nucleic acid vaccines. An important issue in the field of nucleic 
acid vaccines is the fact that the extent of sequence homology of the vaccine with sequences 
in the animal to which it is administered determines the probability of integration of the 
vaccine sequence into the genome of the animal. A fundamental safety concern of nucleic 
acid vaccines is their potential to integrate into genomic sequences, which can cause 
deregulation of gene expression and tumor transformation. The Food and Drug 
Administration has advised that nucleic acid and recombinant vaccines should contain as 
little sequence homology with human sequences as possible. In the case of vaccines 
delivering tumor-associated antigens, it is inevitable that the vaccines contain nucleic acid 
sequences that are homologous to those which encode proteins that are expressed in the 
tumor cells of patients. It is, however, highly desirable to limit the extent of those sequences 
to that which is minimally essential to facilitate the expression of epitopes for inducing 
therapeutic immune responses. The use of ECRs thus offers the dual benefit of providing a 
minimal region of homology, while incorporating multiple epitopes that have potential 
therapeutic value. 

[0074] Note that the following discussion sets forth the inventors' understanding 
of the operation of the invention. However, it is not intended that this discussion limit the 
patent to any particular theory of operation not set forth in the claims. 

ECRs are Processed into MHC-Binding Epitopes in pAPCs 

[0075] The immune system constantly surveys the body for the presence of 
foreign antigens, in part through the activity of pAPCs. The pAPCs endocytose matter found 
in the extracellular milieu, process that matter from a polypeptide form into shorter 
oligopeptides of about 3 to 23 amino acids in length, and display some of the resulting 
peptides to T cells via the MHC complex of the pAPCs. For example, a tumor cell upon lysis 
releases its cellular contents, including various proteins, into the extracellular milieu. Those 
released proteins can be endocytosed by pAPCs and processed into discrete peptides that are 
then displayed on the surface of the pAPCs via the MHC. By this mechanism, it is not the 
entire target protein that is presented on the surface of the pAPCs, but rather only one or more
discrete fragments of that protein that are presented as MHC-binding epitopes. If a presented 
epitope is recognized by a T cell, that T cell is activated and an immune response results. 

[0076] Similarly, the scavenger receptors on pAPCs can take up naked nucleic
acid sequences or recombinant organisms containing target nucleic acid sequences. Uptake of 
the nucleic acid sequences into the pAPC subsequently results in the expression of the 
encoded products. As above, when an ECR can be processed into one or more useful 
epitopes, these products can be presented as MHC epitopes for recognition by T cells. 

[0077] MHC-binding epitopes are often distributed unevenly throughout a protein 
sequence in clusters. Embodiments of the invention are directed to identifying epitope 
cluster regions (ECRs) in a particular region of a target protein. Candidate ECRs are likely to 
be natural substrates for various proteolytic enzymes and are likely to be processed into one 
or more epitopes for MHC display on the surface of a pAPC. In contrast to more traditional
vaccines that deliver whole proteins or biological agents, ECRs can be administered as 
vaccines, resulting in a high probability that at least one epitope will be presented on MHC 
without requiring the use of a full length sequence. 

The Use of ECRs in Identifying Discrete MHC-Binding Epitopes 

[0078] Identifying putative MHC epitopes for use in vaccines often includes the 
use of available predictive algorithms that analyze the sequences of proteins or genes to 
predict binding affinity of peptide fragments for MHC. These algorithms rank putative 
epitopes according to predicted affinity or other characteristics associated with MHC binding. 
Exemplary algorithms for this kind of analysis include the Rammensee and NIH (Parker) 
algorithms. However, identifying epitopes that are naturally present on the surface of cells 
from among putative epitopes predicted using these algorithms has proven to be a difficult 
and laborious process. The use of ECRs in an epitope identification process can enormously 
simplify the task of identifying discrete MHC binding epitopes. 

[0079] In a preferred embodiment, ECR polypeptides are synthesized on an 
automated peptide synthesizer and these ECRs are then subjected to in vitro digests using 
proteolytic enzymes involved in processing proteins for presentation of the epitopes. Mass 
spectrometry and/or analytical HPLC are then used to identify the digest products, and in vitro
MHC binding studies are used to assess the ability of these products to actually bind to MHC.
Once epitopes contained in ECRs have been shown to bind MHC, they can be incorporated
into vaccines or used as diagnostics, either as discrete epitopes or in the context of ECRs. 

[0080] The use of an ECR (which because of its relatively short sequence can be 
produced through chemical synthesis) in this preferred embodiment is a significant 
improvement over what otherwise would require the use of whole protein. This is because 
whole proteins have to be produced using recombinant expression vector systems and/or 
complex purification procedures. The simplicity of using chemically synthesized ECRs 
enables the analysis and identification of large numbers of epitopes, while greatly reducing 
the time and expense of the process as compared to other currently used methods. The use of 
a defined ECR also greatly simplifies mass spectrum analysis of the digest, since the products 
of an ECR digest are a small fraction of the digest products of a whole protein. 

[0081] In another embodiment, nucleic acid sequences encoding ECRs are used to 
express the polypeptides in cells or cell lines to assess which epitopes are presented on the 
surface. A variety of means can be used to detect the epitope on the surface. Preferred 
embodiments involve the lysis of the cells and affinity purification of the MHC, and 
subsequent elution and analysis of peptides from the MHC; or elution of epitopes from intact
cells (Falk, K. et al. Nature 351:290, 1991, and U.S. Patent 5,989,565, respectively, both of
which references are incorporated herein by reference in their entirety). A sensitive method 
for analyzing peptides eluted in this way from the MHC employs capillary or nanocapillary 
HPLC ESI mass spectrometry and on-line sequencing. 

Target-Associated Antigens that Contain ECRs

[0082] TAAs from which ECRs may be defined include those from TuAAs, 
including oncofetal, cancer-testis, deregulated genes, fusion genes from errant translocations, 
differentiation antigens, embryonic antigens, cell cycle proteins, mutated tumor suppressor
genes, and overexpressed gene products, including oncogenes. In addition, ECRs may be 
derived from virus gene products, particularly those associated with viruses that cause 
chronic diseases or are oncogenic, such as the herpes viruses, human papilloma viruses, 
human immunodeficiency virus, and human T cell leukemia virus. Also ECRs may be
derived from gene products of parasitic organisms, such as Trypanosoma, Leishmania, and 
other intracellular or parasitic organisms. 

[0083] Some of these TuAA include α-fetoprotein, carcinoembryonic antigen
(CEA), esophageal cancer derived NY-ESO-1, and SSX genes, SCP-1, PRAME,
MART-1/MelanA (MART-1), gp100 (Pmel 17), tyrosinase, TRP-1, TRP-2, MAGE-1, MAGE-2,
MAGE-3, BAGE, GAGE-1, GAGE-2, p15; overexpressed oncogenes and mutated
tumor-suppressor genes such as p53, Ras, HER-2/neu; unique tumor antigens resulting from
chromosomal translocations such as BCR-ABL, E2A-PRL, H4-RET, IGH-IGK, MYL-RAR;
and viral antigens, EBNA1, EBNA2, HPV-E6, -E7; prostate specific antigen (PSA), prostate
stem cell antigen (PSCA), MAAT-1, GP-100, TSP-180, MAGE-4, MAGE-5, MAGE-6,
RAGE, p185erbB-2, p185erbB-3, c-met, nm-23H1, TAG-72, CA 19-9, CA 72-4, CAM 17.1,
NuMa, K-ras, β-Catenin, CDK4, Mum-1, p15, and p16.

[0084] Numerous other TAAs are also contemplated for both pathogens and 
tumors. In terms of TuAAs, a variety of methods are available and well known in the art to 
identify genes and gene products that are differentially expressed in neoplastic cells as 
compared to normal cells. Examples of these techniques include differential hybridization, 
including the use of microarrays; subtractive hybridization cloning; differential display, either 
at the level of mRNA or protein expression; EST sequencing; and SAGE (serial analysis
of gene expression). These nucleic acid techniques have been reviewed by Carulli, J.P. et al., 
J. Cellular Biochem Suppl. 30/31:286-296, 1998 (hereby incorporated by reference). 
Differential display of proteins involves, for example, comparison of two-dimensional
polyacrylamide gel electrophoresis of cell lysates from tumor and normal tissue, location of
protein spots unique or overexpressed in the tumor, recovery of the protein from the gel, and 
identification of the protein using traditional biochemical- or mass spectrometry-based 
sequencing. An additional technique for identification of TAAs is the SEREX technique,
discussed in Tureci, O., Sahin, U., and Pfreundschuh, M., "Serological analysis of human
tumor antigens: molecular definition and implications", Molecular Medicine Today, 3:342, 
1997, and hereby incorporated by reference. 

[0085] Use of these and other methods provides one of skill in the art the
techniques necessary to identify genes and gene products contained within a target cell that 
may be used as potential candidate proteins for generating the epitopes of the invention 
disclosed. However, it is not necessary, in practicing the invention, to identify a novel TuAA 
or TAA. Rather, embodiments of the invention make it possible to identify ECRs from any 
relevant protein sequence, whether the sequence is already known or is new. 

Protein Sequence Analysis to Identify Epitope Clusters 

[0086] In preferred embodiments of the invention, identification of ECRs 
involves two main steps: (1) identifying good putative epitopes; and (2) defining the limits 
of any clusters in which these putative epitopes are located. There are various preferred 
embodiments of each of these two steps, and a selected embodiment for the first step can be 
freely combined with a selected embodiment for the second step. The methods and 
embodiments that are disclosed herein for each of these steps are merely exemplary, and are 
not intended to limit the scope of the invention in any way. Persons of skill in the art will 
appreciate the specific tools that can be applied to the analysis of a specific TAA, and such 
analysis can be conducted in numerous ways in accordance with the invention. 

[0087] Preferred embodiments for identifying good putative epitopes include the 
use of any available predictive algorithm that analyzes the sequences of proteins or genes to 
predict binding affinity of peptide fragments for MHC, or to rank putative epitopes according 
to predicted affinity or other characteristics associated with MHC binding. As described 
above, available exemplary algorithms for this kind of analysis include the Rammensee and 
NIH (Parker) algorithms. Likewise, good putative epitopes can be identified by direct or 
indirect assays of MHC binding. To choose "good" putative epitopes, it is necessary to set a 
cutoff point in terms of the score reported by the prediction software or in terms of the 
assayed binding affinity. In some embodiments, such a cutoff is absolute. For example, the 
cutoff can be based on the measured or predicted half time of dissociation between an epitope 
and a selected MHC allele. In such cases, embodiments of the cutoff can be any half time of 
dissociation longer than, for example, 0.5 minutes; in a preferred embodiment longer than 2.5 
minutes; in a more preferred embodiment longer than 5 minutes; and in a highly stringent 
embodiment can be longer than 10, or 20, or 25 minutes. In these embodiments, the good
putative epitopes are those that are predicted or identified to have good MHC binding 
characteristics, defined as being on the desirable side of the designated cutoff point. 
Likewise, the cutoff can be based on the measured or predicted binding affinity between an 
epitope and a selected MHC allele. Additionally, the absolute cutoff can be simply a selected 
number of putative epitopes. 
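
The absolute cutoffs described above can be illustrated with a minimal Python sketch. The `Candidate` record, the function names, the peptide strings, and the half-time values are all hypothetical and chosen only for illustration; in practice the half times would come from a prediction algorithm or a binding assay.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    start: int        # 1-based position of the first residue in the TAA
    peptide: str      # hypothetical 9-mer sequence
    half_time: float  # predicted or measured half time of dissociation (minutes)


def above_half_time(cands: List[Candidate], cutoff_min: float) -> List[Candidate]:
    """Absolute cutoff: keep candidates whose half time exceeds cutoff_min."""
    return [c for c in cands if c.half_time > cutoff_min]


def top_n(cands: List[Candidate], n: int) -> List[Candidate]:
    """Absolute cutoff by count: keep the n candidates with the longest half times."""
    return sorted(cands, key=lambda c: c.half_time, reverse=True)[:n]


# Illustrative values only; not taken from any real antigen.
candidates = [
    Candidate(3, "ALDVYNGLL", 0.3),
    Candidate(17, "KLQCVDLHV", 4.0),
    Candidate(21, "VDLHVISND", 12.0),
    Candidate(58, "FLTPKKLQC", 31.0),
]

# The cutoffs below mirror the exemplary half times named in the text.
for cutoff in (0.5, 2.5, 5.0, 10.0):
    kept = above_half_time(candidates, cutoff)
    print(f"half time > {cutoff} min: {[c.start for c in kept]}")
print("top 2 by half time:", [c.start for c in top_n(candidates, 2)])
```

Raising the cutoff shrinks the set of good putative epitopes, whereas the count-based cutoff returns the same number of candidates regardless of how their half times are distributed.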

[0088] In other embodiments, the cutoff is relative. For example, a selected 
percentage of the total number of putative epitopes can be used to establish the cutoff for 
defining a candidate sequence as a good putative epitope. Again the properties for ranking 
the epitopes are derived from measured or predicted MHC binding; the property used for 
such a determination can be any that is relevant to or indicative of binding. In preferred 
embodiments, identification of good putative epitopes can combine multiple methods of 
ranking candidate sequences. In such embodiments, the good epitopes are typically those that 
either represent a consensus of the good epitopes based on different methods and parameters, 
or that are particularly highly ranked by at least one of the methods. 
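
One way a relative cutoff and a combination of ranking methods might be expressed is sketched below in Python. The two score dictionaries stand in for the output of two different ranking methods; the function names, the percentages, and the scores are assumptions made for illustration and are not prescribed by this disclosure.

```python
import math
from typing import Dict, Set


def top_fraction(scores: Dict[int, float], fraction: float) -> Set[int]:
    """Return start positions of candidates ranked in the top `fraction` by score."""
    keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=lambda pos: scores[pos], reverse=True)
    return set(ranked[:keep])


def good_epitopes(method_a: Dict[int, float], method_b: Dict[int, float],
                  fraction: float = 0.5, elite: float = 0.25) -> Set[int]:
    """Consensus of both methods, plus candidates ranked very highly by either one."""
    consensus = top_fraction(method_a, fraction) & top_fraction(method_b, fraction)
    standouts = top_fraction(method_a, elite) | top_fraction(method_b, elite)
    return consensus | standouts


# Hypothetical scores keyed by start position of each candidate peptide.
method_a = {10: 30.0, 14: 22.0, 40: 5.0, 77: 1.0}
method_b = {10: 50.0, 14: 3.0, 40: 400.0, 77: 0.5}

print(sorted(good_epitopes(method_a, method_b)))
```

Here position 10 is retained because both methods place it in their top half, and position 40 is retained because one method ranks it first even though the other does not.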

[0089] When several good putative epitopes have been identified, their positions 
relative to each other can be analyzed to determine the optimal clusters for use in vaccines or 
in vaccine design. This analysis is based on the density of a selected epitope characteristic 
within the sequence of the TAA. The regions with the highest density of the characteristic, or 
with a density above a certain selected cutoff, are designated as ECRs. Various embodiments 
of the invention employ different characteristics for the density analysis. For example, one 
preferred characteristic is simply the presence of any good putative epitope (as defined by any 
appropriate method). In this embodiment, all putative epitopes above the cutoff are treated 
equally in the density analysis, and the best clusters are those with the highest density of good 
putative epitopes per amino acid residue. In another embodiment, the preferred characteristic 
is based on the parameter(s) previously used to score or rank the putative epitopes. In this 
embodiment, a putative epitope with a score that is twice as high as another putative epitope 
is doubly weighted in the density analysis, relative to the other putative epitope. Still other 
embodiments take the score or rank into account, but on a diminished scale, such as, for 
example, by using the log or the square root of the score to give more weight to some
putative epitopes than to others in the density analysis. 
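
The alternative weightings described above can be compared side by side in a short Python sketch. The scheme names, the function name, and the scores are hypothetical; the natural logarithm is used only because the text does not specify a base, and scores are assumed to be greater than 1 so that the log weights remain positive.

```python
import math
from typing import Callable, Dict, Sequence

# Each scheme maps a putative epitope's score to its weight in the density analysis.
WEIGHTINGS: Dict[str, Callable[[float], float]] = {
    "presence": lambda score: 1.0,           # all good putative epitopes count equally
    "score": lambda score: score,            # twice the score, twice the weight
    "log": lambda score: math.log(score),    # diminished scale (assumed natural log)
    "sqrt": lambda score: math.sqrt(score),  # diminished scale
}


def weighted_density(scores: Sequence[float], n_residues: int, scheme: str) -> float:
    """Sum of weights of the epitopes in a region, per amino acid residue."""
    weight = WEIGHTINGS[scheme]
    return sum(weight(s) for s in scores) / n_residues


# Hypothetical region of 20 residues containing two putative epitopes.
region_scores = [100.0, 25.0]
for scheme in WEIGHTINGS:
    print(scheme, round(weighted_density(region_scores, 20, scheme), 4))
```

Under the "score" scheme the first epitope contributes four times as much as the second, while under the "sqrt" scheme it contributes only twice as much.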

[0090] Depending on the length of the TAA to be analyzed, the number of 
possible candidate epitopes, the number of good putative epitopes, the variability of the 
scoring of the good putative epitopes, and other factors that become evident in any given 
analysis, the various embodiments of the invention can be used alone or in combination to 
identify those ECRs that are most useful for a given application. Iterative or parallel analyses 
employing multiple approaches can be beneficial in many cases. ECRs are tools for 
increased efficiency of identifying true MHC epitopes, and for efficient "packaging" of MHC 
epitopes into vaccines. Accordingly, any of the embodiments described herein, or other 
embodiments that are evident to those of skill in the art based on this disclosure, are useful in 
enhancing the efficiency of these efforts by using ECRs instead of using complete TAAs in 
vaccines and vaccine design. 

[0091] Since many or most TAAs have regions with low density of predicted 
MHC epitopes, using ECRs provides a valuable methodology that avoids the inefficiencies of 
including regions of low epitope density in vaccines and in epitope identification protocols. 
Thus, useful ECRs can also be defined as any portion of a TAA that is not the whole TAA, 
wherein the portion has a higher density of putative epitopes than the whole TAA, or than 
any regions of the TAA that have a particularly low density of putative epitopes. In this 
aspect of the invention, therefore, an ECR can be any fragment of a TAA with elevated 
epitope density. In some embodiments, an ECR can include a region up to about 80% of the 
length of the TAA. In a preferred embodiment, an ECR can include a region up to about 50% 
of the length of the TAA. In a more preferred embodiment, an ECR can include a region up 
to about 30 % of the length of the TAA. And in a most preferred embodiment, an ECR can 
include a region of between 5 and 15% of the length of the TAA. 

[0092] In another aspect of the invention, the ECR can be defined in terms of its 
absolute length. Accordingly, by this definition, the minimal cluster for 9-mer epitopes 
includes 10 amino acid residues and has two overlapping 9-mers with 8 amino acids in 
common. In a preferred embodiment, the cluster is between about 15 and 75 amino acids in 
length. In a more preferred embodiment, the cluster is between about 20 and 60 amino acids 
in length. In a most preferred embodiment, the cluster is between about 30 and 40 amino
acids in length. 

[0093] In practice, as described above, ECR identification can employ a simple 
density function such as the number of epitopes divided by the number of amino acids 
spanned by those epitopes. It is not necessarily required that the epitopes overlap, but the
value for a single epitope is not significant. If only a single value for a percentage cutoff is 
used and an absolute cutoff in the epitope prediction is not used, it is possible to set a single 
threshold at this step to define a cluster. However, combining an absolute cutoff with
different percentage cutoffs in the first step can produce variations in the global
density of candidate epitopes. Such variations can require further accounting or
manipulation. For example, an overlap of 2 epitopes is more significant if only 3 candidate 
epitopes were considered, than if 30 candidates were considered for any particular length 
protein. To take this feature into consideration, the weight given to a particular cluster can 
further be divided by the fraction of possible peptides actually being considered, in order to 
increase the significance of the calculation. This scales the result to the average density of 
predicted epitopes in the parent protein. 
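
The simple density function and its scaling can be worked through in a brief Python sketch. It assumes 9-mer epitopes identified by their 1-based start positions and takes the number of possible peptides in a protein of length L to be L - 8; the function names and the numbers are hypothetical.

```python
from typing import Sequence


def cluster_density(starts: Sequence[int], k: int = 9) -> float:
    """Number of epitopes divided by the number of amino acids they span."""
    span = max(starts) + k - min(starts)
    return len(starts) / span


def scaled_density(starts: Sequence[int], n_considered: int,
                   protein_length: int, k: int = 9) -> float:
    """Cluster density divided by the fraction of possible peptides considered."""
    possible = protein_length - k + 1   # assumed count of overlapping k-mers
    fraction = n_considered / possible
    return cluster_density(starts, k) / fraction


# Two overlapping 9-mers starting at residues 10 and 11 span 10 residues,
# which is the minimal cluster described in the text.
pair = [10, 11]
print(cluster_density(pair))
# The same overlap counts for more when only 3 candidates were considered
# than when 30 were considered (hypothetical protein of 108 residues).
print(scaled_density(pair, 3, 108), scaled_density(pair, 30, 108))
```

With these illustrative numbers the scaled value is ten times larger when only 3 candidate epitopes were considered than when 30 were considered.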

[0094] Similarly, some embodiments base the scoring of good putative epitopes 
on the average number of peptides considered per amino acid in the protein. The resulting 
ratio represents the factor by which the density of predicted epitopes in the putative cluster 
differs from the average density in the protein. Accordingly, an ECR is defined in one 
embodiment as any region containing two or more predicted epitopes for which this ratio 
exceeds 2, that is, any region with twice the average density of epitopes. In other 
embodiments, the region is defined as an ECR if the ratio exceeds 1.5, 3, 4, or 5, or more. 
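
A minimal Python sketch of this ratio test follows, assuming 9-mer epitopes identified by 1-based start positions. The function names, the protein length, and the positions are hypothetical; the default threshold of 2 is the value given in the embodiment above.

```python
from typing import Sequence


def density_ratio(cluster_starts: Sequence[int], all_starts: Sequence[int],
                  protein_length: int, k: int = 9) -> float:
    """Density of epitopes in the putative cluster relative to the protein average."""
    span = max(cluster_starts) + k - min(cluster_starts)
    cluster_density = len(cluster_starts) / span
    average_density = len(all_starts) / protein_length
    return cluster_density / average_density


def is_ecr(cluster_starts: Sequence[int], all_starts: Sequence[int],
           protein_length: int, k: int = 9, threshold: float = 2.0) -> bool:
    """A region qualifies if it holds two or more epitopes and exceeds the ratio."""
    if len(cluster_starts) < 2:
        return False
    return density_ratio(cluster_starts, all_starts, protein_length, k) > threshold


# Hypothetical protein of 300 residues with five predicted epitopes.
all_starts = [10, 12, 15, 80, 200]
print(is_ecr([10, 12, 15], all_starts, 300))  # three epitopes within 14 residues
print(is_ecr([80, 200], all_starts, 300))     # two epitopes spread over 129 residues
```

Other thresholds named in the text, such as 1.5, 3, 4, or 5, can be supplied through the `threshold` argument.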

[0095] Considering the average number of peptides per amino acid in a target 
protein to calculate the presence of an ECR highlights densely populated ECRs without 
regard to the score/affinity of the individual constituents. This is most appropriate for use of 
score-based cutoffs. However, an ECR with only a small number of highly ranked candidates 
can be of more biological significance than a cluster with several densely packed but lower 
ranking candidates, particularly if only a small percentage of the total number of candidate 
peptides were designated as good putative epitopes. Thus in some embodiments it is 
appropriate to take into consideration the scores of the individual peptides. This is most
readily accomplished by substituting the sum of the scores of the peptides in the putative 
cluster for the number of peptides in the putative cluster in the calculation described above. 

[0096] This sum of scores method is more sensitive to sparsely populated clusters 
containing high scoring epitopes. Because the wide range of scores (i.e., half times of
dissociation) produced by the BIMAS-NIH/Parker algorithm can lead to a single high scoring
peptide dwarfing the contribution of other potential epitopes, the log of the score rather than 
the score itself is preferably used in this procedure. 
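
The effect of substituting the log of the score can be seen in a small Python sketch. The half times are hypothetical, and a base-10 logarithm is assumed because the text does not specify a base.

```python
import math
from typing import Sequence


def score_sum(scores: Sequence[float]) -> float:
    """Sum of raw scores of the peptides in a putative cluster."""
    return sum(scores)


def log_score_sum(scores: Sequence[float]) -> float:
    """Sum of log10 scores, which compresses the range of half times."""
    return sum(math.log10(s) for s in scores)


# Hypothetical half times of dissociation (minutes) for two putative clusters.
sparse_cluster = [2000.0]           # one very high scoring peptide
dense_cluster = [50.0, 40.0, 30.0]  # three moderately scoring peptides

print(score_sum(sparse_cluster), score_sum(dense_cluster))
print(round(log_score_sum(sparse_cluster), 3), round(log_score_sum(dense_cluster), 3))
```

With raw scores the single peptide dominates; with log scores the three moderately scoring peptides together outweigh it, which is the dampening effect the paragraph describes.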

[0097] Various other calculations can be devised under one or another condition. 
Generally speaking, the epitope density function is constructed so that it is proportional to the 
number of predicted epitopes, their scores, their ranks, and the like, within the putative 
cluster, and inversely proportional to the number of amino acids or fraction of protein 
contained within that putative cluster. Alternatively, the function can be evaluated for a 
window of a selected number of contiguous amino acids. In either case the function is also 
evaluated for all predicted epitopes in the whole protein. If the ratio of values for the putative 
cluster (or window) and the whole protein is greater than, for example, 1.5, 2, 3, 4, 5, or 
more, an ECR is defined. 
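
The window-based alternative can be sketched in Python as follows. The sketch uses the simplest density function (a count of epitopes lying wholly inside the window), assumes 9-mer epitopes identified by 1-based start positions, and uses a hypothetical window of 30 residues; none of these choices is required by the text.

```python
from typing import Dict, List, Sequence


def window_ratios(starts: Sequence[int], protein_length: int,
                  window: int = 30, k: int = 9) -> Dict[int, float]:
    """Ratio of epitope density in each window to the density in the whole protein."""
    ratios = {}
    for first in range(1, protein_length - window + 2):
        last = first + window - 1
        inside = sum(1 for s in starts if s >= first and s + k - 1 <= last)
        # (inside / window) / (len(starts) / protein_length), as a single division
        ratios[first] = (inside * protein_length) / (window * len(starts))
    return ratios


def ecr_windows(starts: Sequence[int], protein_length: int, window: int = 30,
                k: int = 9, threshold: float = 2.0) -> List[int]:
    """First residues of the windows whose density ratio exceeds the threshold."""
    ratios = window_ratios(starts, protein_length, window, k)
    return [first for first, ratio in ratios.items() if ratio > threshold]


# Hypothetical protein of 120 residues with four predicted epitopes.
starts = [10, 12, 15, 80]
print(ecr_windows(starts, 120))
```

In this illustration only the windows that contain all three of the closely spaced epitopes exceed the threshold; adjacent qualifying windows could then be merged to delimit the ECR.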

Analysis of Target Gene Products For MHC Binding 

[0098] Once a TAA has been identified, the protein sequence can be used to 
identify putative epitopes with known or predicted affinity to the MHC peptide binding cleft. 
Tests of peptide fragments can be conducted in vitro, or the sequence can be analyzed by
computer to determine MHC receptor binding of the peptide fragments. In one embodiment
of the invention, peptide fragments based on the amino acid sequence of the target protein are 
analyzed for their predicted ability to bind to the MHC peptide binding cleft. Examples of 
suitable computer algorithms for this purpose include that found at the world wide web page 
of Hans-Georg Rammensee, Jutta Bachmann, Niels Emmerich, Stefan Stevanovic: 
SYFPEITHI: An Internet Database for MHC Ligands and Peptide Motifs (access via 
hypertext transfer protocol: //134.2.96.221/scripts/hlaserver.dll/EpPredict.htm). Results
obtained from this method are discussed in Rammensee, et al., "MHC Ligands and Peptide 
Motifs," Landes Bioscience Austin, TX, 224-227, 1997, which is hereby incorporated by
reference in its entirety. Another site of interest is found at hypertext transfer protocol:
//bimas.dcrt.nih.gov/molbio/hla_bind, which also contains a suitable algorithm. The methods
of this web site are discussed in Parker, et al., "Scheme for ranking potential HLA-A2 
binding peptides based on independent binding of individual peptide side-chains," J. 
Immunol. 152:163-175, which is hereby incorporated by reference in its entirety. 
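
The first step of such an analysis, breaking the target protein sequence into overlapping peptide fragments for scoring, can be sketched in Python. The sequence is hypothetical, and `toy_anchor_score` is a placeholder that merely counts matches to an assumed anchor preference at positions 2 and 9; it is not the Rammensee or Parker algorithm and would be replaced by a real predictor or by assay data.

```python
from typing import List, Tuple


def enumerate_peptides(sequence: str, k: int = 9) -> List[Tuple[int, str]]:
    """Return (1-based start position, peptide) for every overlapping k-mer."""
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


def toy_anchor_score(peptide: str) -> int:
    """Placeholder scorer (assumption): counts anchor matches at positions 2 and 9."""
    return int(peptide[1] in "LM") + int(peptide[-1] in "VL")


# Hypothetical 22-residue sequence used only to exercise the functions.
sequence = "MKTAYIAKQRQISFVKSHFSRQ"
peptides = enumerate_peptides(sequence)
putative = [(pos, pep) for pos, pep in peptides if toy_anchor_score(pep) >= 1]

print(len(peptides), "candidate 9-mers")
print(putative)
```

A sequence of length L yields L - 8 overlapping 9-mers, so the number of candidates grows linearly with the length of the TAA.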

[0099] As an alternative to predictive algorithms, a number of standard in vitro 
receptor binding affinity assays are available to identify peptides having an affinity for a 
particular allele of MHC. Accordingly, by the method of this aspect of the invention, the 
initial population of peptide fragments can be narrowed to include only putative epitopes 
having an actual or predicted affinity for the selected allele of MHC. Selected common 
alleles of MHC I, and their approximate frequencies, are reported in the tables below. 

Table 1

Estimated gene frequencies of HLA-A antigens

Antigen         CAU Gf^a  CAU SE^b    AFR Gf    AFR SE    ASI Gf    ASI SE    LAT Gf    LAT SE    NAT Gf    NAT SE
A1               15.1843    0.0489    5.7256    0.0771    4.4818    0.0846    7.4007    0.0978   12.0316    0.2533
A2               28.6535    0.0619   18.8849    0.1317   24.6352    0.1794   28.1198    0.1700   29.3408    0.3585
A3               13.3890    0.0463    8.4406    0.0925    2.6454    0.0655    8.0789    0.1019   11.0293    0.2437
A28               4.4652    0.0280    9.9269    0.0997    1.7657    0.0537    8.9446    0.1067    5.3856    0.1750
A36               0.0221    0.0020    1.8836    0.0448    0.0148    0.0049    0.1584    0.0148    0.1545    0.0303
A23               1.8287    0.0181   10.2086    0.1010    0.3256    0.0231    2.9269    0.0628    1.9903    0.1080
A24               9.3251    0.0395    2.9668    0.0560   22.0391    0.1722   13.2610    0.1271   12.6613    0.2590
A9 unsplit        0.0809    0.0038    0.0367    0.0063    0.0858    0.0119    0.0537    0.0086    0.0356    0.0145
A9 total         11.2347    0.0429   13.2121     [n/r]   22.4505    0.1733   16.2416    0.1382   14.6872    0.2756
[n/r]             2.1157    0.0195    0.4329    0.0216    0.0990    0.0128    1.1937    0.0404    1.4520    0.0924
A26               3.8795    0.0262    2.8284    0.0547    4.6628    0.0862    3.2612    0.0662    2.4292    0.1191
A34               0.1508    0.0052    3.5228    0.0610    1.3529    0.0470    0.4928    0.0260    0.3150    0.0432
A43               0.0018    0.0006    0.0334    0.0060    0.0231    0.0062    0.0055    0.0028    0.0059    0.0059
A66               0.0173    0.0018    0.2233    0.0155    0.0478    0.0089    0.0399    0.0074    0.0534    0.0178
A10 unsplit       0.0790    0.0038    0.0939    0.0101    0.1255    0.0144    0.0647    0.0094    0.0298    0.0133
A10 total         6.2441    0.0328    7.1348    0.0850    6.3111    0.0993    5.0578    0.0816    4.2853    0.1565
A29               3.5796    0.0252    3.2071    0.0582    1.1233    0.0429    4.5156    0.0774    3.4345    0.1410
A30               2.5067    0.0212   13.0969    0.1129    2.2025    0.0598    4.4873    0.0772    2.5314    0.1215
A31               2.7386    0.0221    1.6556    0.0420    3.6005    0.0761    4.8328    0.0800    6.0881    0.1855
A32               3.6956    0.0256    1.5384    0.0405    1.0331    0.0411    2.7064    0.0604    2.5521    0.1220
A33               1.2080    0.0148    6.5607    0.0822    9.2701    0.1191    2.6593    0.0599    1.0754    0.0796
A74               0.0277    0.0022    1.9949    0.0461    0.0561    0.0096    0.2027    0.0167    0.1068    0.0252
A19 unsplit       0.0567    0.0032    0.2057    0.0149    0.0990    0.0128    0.1211    0.0129    0.0475    0.0168
A19 total        13.8129    0.0468   28.2593    0.1504   17.3846    0.1555   19.5252    0.1481   15.8358    0.2832
AX                0.8204    0.0297    4.9506    0.0963    2.9916    0.1177    1.6332    0.0878    1.8454    0.1925

a Gene frequency.
b Standard error.
[n/r]: entry not recoverable from the source text.



Table 2

Estimated gene frequencies for HLA-B antigens

Antigen         CAU Gf^a  CAU SE^b    AFR Gf    AFR SE    ASI Gf    ASI SE    LAT Gf    LAT SE    NAT Gf    NAT SE
B8            [values not recoverable]
[Between B8 and the next row the source retains only isolated values, in order: 0.8103, 0.0566, 0.0567; their rows and columns are not recoverable.]
[n/r]              [n/r]    0.0159    0.5916    0.0252    1.2327    0.0449    0.7807    0.0327    0.9755    0.0759
B41                [n/r]    0.0129     [n/r]    0.0296    0.1303    0.0147    1.2818    0.0418    0.4766    0.0531
B42               0.0608    0.0033    5.6991    0.0768    0.0841    0.0118    0.5866    0.0284    0.2856    0.0411
B46               0.0099    0.0013    0.0151    0.0040     [n/r]    0.0886    0.0234    0.0057    0.0238    0.0119
B47           [values not recoverable]
[Between B47 and B51 the source retains only the following, in order: 0.0798, an illegible entry, "0 0055", 0.0108, 0.0014, 0.0032, 0.0019, 0.0132, 0.0047, 0.0261, 0.0060; their rows and columns are not recoverable.]
B51               5.4215    0.0307    2.5980    0.0525    7.4751    0.1080    6.8147    0.0943    6.9077    0.1968
B52                [n/r]    0.0132    1.3712    0.0383    3.5121    0.0752    2.2447    0.0552    0.6960    0.0641
B5 unsplit         [n/r]    0.0053    0.1522    0.0128    0.1288    0.0146    0.1546    0.0146    0.1307    0.0278
B5 total          6.5438    0.0435    4.1214    0.0747   11.1160    0.1504    9.2141    0.1324    7.7344    0.2784
B44              13.4838    0.0465    7.0137    0.0847    5.6807    0.0948    9.9253    0.1121   11.8024    0.2511
B45               0.5771    0.0102    4.8069    0.0708    0.1816    0.0173    1.8812    0.0506    0.7603    0.0670
B12 unsplit       0.0788    0.0038    0.0280    0.0055    0.0049    0.0029    0.0193    0.0051    0.0654    0.0197
B12 total        14.1440    0.0474   11.8486    0.1072    5.8673    0.0963   11.8258    0.1210   12.6281    0.2584
[n/r]              [n/r]     [n/r]     [n/r]     [n/r]    9.2249    0.1190    4.1825    0.0747    6.9421    0.1973
[In the preceding row the label and the CAU and AFR entries are illegible in the source ("n^7", "S Q1 1 7", "ft <Y*7ft", "1 S767", "ft ft4ft4"); an additional value, 0.3738, appears between 6.9421 and 0.1973 and cannot be assigned to a column.]
B63               0.4302    0.0088    1.8865    0.0448    0.4438    0.0270    0.8083    0.0333    0.0356    0.0471
B75               0.0104    0.0014    0.0226    0.0049    1.9673    0.0566    0.1101    0.0123         0    0.0145
B76               0.0026    0.0007    0.0065    0.0026    0.0874    0.0120    0.0055    0.0028       0^c
B77               0.0057    0.0010    0.0119    0.0036    0.0577    0.0098    0.0083    0.0034    0.0059    0.0059
B15 unsplit       0.1305    0.0049    0.0691    0.0086    0.4301    0.0266    0.1820    0.0158    0.0715    0.0206
B15 total         6.4910    0.0334    3.5232    0.0608   12.2112    0.1344    5.2967    0.0835    7.4290    0.2035
B38               2.4413    0.0209    0.3323    0.0189    3.2818    0.0728    1.9652    0.0517    1.1017    0.0806
B39               1.9614    0.0188    1.2893    0.0371    2.0352    0.0576    6.3040    0.0909    4.5527    0.1615
B16 unsplit       0.0638    0.0034    0.0237    0.0051    0.0644    0.0103    0.1226    0.0130    0.0593    0.0188
B16 total         4.4667    0.0280    1.6453    0.0419    5.3814    0.0921    8.3917    0.1036    5.7137    0.1797
B57               3.5955    0.0252    5.6746    0.0766    2.5782    0.0647    2.1800    0.0544    2.7265    0.1260
B58               0.7152    0.0114    5.9546    0.0784    4.0189    0.0803    1.2481    0.0413    0.9398    0.0745
B17 unsplit       0.2845    0.0072    0.3248    0.0187    0.3751    0.0248    0.1446    0.0141    0.2674    0.0398
B17 total         4.5952    0.0284   11.9540    0.1076    6.9722    0.1041    3.5727    0.0691    3.9338    0.1503
B49               1.6452    0.0172    2.6286    0.0528    0.2440    0.0200    2.3353    0.0562    1.5462    0.0953
B50               1.0580    0.0138    0.8636    0.0304    0.4421    0.0270    1.8883    0.0507    0.7862    0.0681
B21 unsplit       0.0702    0.0036    0.0270    0.0054    0.0132    0.0047    0.0771    0.0103    0.0356    0.0145
B21 total         2.7733    0.0222    3.5192    0.0608    0.6993    0.0339    4.3007    0.0755    2.3680    0.1174
B54               0.0124    0.0015    0.0183    0.0044    2.6873    0.0660    0.0289    0.0063    0.0534    0.0178
B55               1.9046    0.0185    0.4895    0.0229    2.2444    0.0604    0.9515    0.0361    1.4054    0.0909
B56               0.5527    0.0100    0.2686    0.0170    0.8260    0.0368    0.3596    0.0222    0.3387    0.0448
B22 unsplit       0.1682    0.0055    0.0496    0.0073    0.2730    0.0212    0.0372    0.0071    0.1246    0.0272
B22 total         2.0852    0.0217    0.8261    0.0297    6.0307    0.0971    1.3771    0.0433    1.9221    0.1060
B60               5.2222    0.0302    1.5299    0.0404    8.3254    0.1135    2.2538    0.0553    5.7218    0.1801
B61               1.1916    0.0147    0.4709    0.0225    6.2072    0.0989    4.6691    0.0788    2.6023    0.1231
B40 unsplit       0.2696    0.0070    0.0388    0.0065    0.3205    0.0230    0.2473    0.0184    0.2271    0.0367
B40 total         6.6834    0.0338    2.0396    0.0465   14.8531    0.1462    7.1702    0.0963    8.5512    0.2168
BX                1.0922    0.0252    3.5258    0.0802    3.8749    0.0988    2.5266    0.0807    1.9867    0.1634

a Gene frequency.
b Standard error.
c The observed gene count was zero.
[n/r]: entry not recoverable from the source text.



Table 3

Estimated gene frequencies of HLA-DR antigens

Antigen             CAU               AFR               ASI               LAT               NAT
                 Gf^a     SE^b       Gf       SE       Gf       SE       Gf       SE       Gf       SE
DR1           10.2279   0.0413   6.8200   0.0832   3.4628   0.0747   7.9859   0.1013   8.2512   0.2139
DR2           15.2408   0.0491  16.2373   0.1222  18.6162   0.1608  11.2389   0.1182  15.3932   0.2818
DR3           10.8708   0.0424  13.3080   0.1124   4.7223   0.0867   7.8998   0.1008  10.2549   0.2361
DR4           16.7589   0.0511   5.7084   0.0765  15.4623   0.1490  20.5373   0.1520  19.8264   0.3123
DR6           14.3937   0.0479  18.6117   0.1291  13.4471   0.1404  17.0265   0.1411  14.8021   0.2772
DR7           13.2807   0.0463  10.1317   0.0997   6.9270   0.1040  10.6726   0.1155  10.4219   0.2378
DR8            2.8820   0.0227   6.2673   0.0800   6.5413   0.1013   9.7731   0.1110   6.0059   0.1844
DR9            1.0616   0.0139   2.9646   0.0559   9.7527   0.1218   1.0712   0.0383   2.8662   0.1291
DR10           1.4790   0.0163   2.0397   0.0465   2.2304   0.0602   1.8044   0.0495   1.0896   0.0801
DR11           9.3180   0.0396  10.6151   0.1018   4.7375   0.0869   7.0411   0.0955   5.3152   0.1740
DR12           1.9070   0.0185   4.1152   0.0655  10.1365   0.1239   1.7244   0.0484   2.0132   0.1086
DR5 unsplit    1.2199   0.0149   2.2957   0.0493   1.4118   0.0480   1.8225   0.0498   1.6769   0.0992
DR5 total     12.4449   0.0045  17.0260   0.1243  16.2858   0.1516  10.5880   0.1148   9.0052   0.2218
DRX            1.3598   0.0342   0.8853   0.0760   2.5521   0.1089   1.4023   0.0930   2.0834   0.2037

a Gene frequency.
b Standard error.
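
As a rough illustration of how the tabulated gene frequencies might be read, the following Python sketch converts a gene frequency (given in percent, as in the tables) into the expected fraction of individuals carrying at least one copy of the antigen. It assumes Hardy-Weinberg equilibrium, which is an assumption of this sketch and is not stated in the tables; the function name `phenotype_frequency` is likewise hypothetical.

```python
def phenotype_frequency(gene_frequency_percent: float) -> float:
    """Expected fraction of individuals carrying at least one copy.

    Assumes Hardy-Weinberg equilibrium (an assumption of this sketch):
    if p is the gene frequency, the fraction carrying no copy is (1 - p)^2.
    """
    p = gene_frequency_percent / 100.0  # the tables list percentages
    return 1.0 - (1.0 - p) ** 2


# Example: the DR4 gene frequency for the CAU population in Table 3.
dr4_cau = phenotype_frequency(16.7589)
print(f"Estimated fraction of CAU individuals carrying DR4: {dr4_cau:.4f}")
```

Under this assumption the phenotype frequency is always somewhat less than twice the gene frequency, because homozygous individuals are counted only once.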



[0105] It has been observed that predicted epitopes often cluster at one or more 
particular regions within the amino acid sequence of a TAA. The identification of such 
ECRs offers a simple and practicable solution to the problem of designing effective vaccines 
for stimulating cellular immunity. For vaccines in which immune epitopes are desired, an 
ECR is directly useful as a vaccine. This is because the immune proteasomes of the pAPCs 
can correctly process the cluster, liberating one or more of the contained MHC-binding 
peptides, in the same way a cell having immune proteasome activity processes and presents 
peptides derived from the complete TAA. The cluster is also a useful starting material for 
identification of housekeeping epitopes produced by the housekeeping proteasomes active in 
peripheral cells. 
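
The observation that predicted epitopes cluster in particular regions can be sketched in Python as follows. This is a minimal illustration only: the grouping rule (merge predicted epitopes that overlap or lie within `max_gap` residues of each other, and keep groups of at least `min_epitopes`), the parameter values, and the function name `find_clusters` are assumptions made for the example and are not the criteria defined in this specification.

```python
from typing import List, Tuple

# (start, end) residue positions of a predicted epitope within the TAA sequence
Span = Tuple[int, int]


def find_clusters(epitopes: List[Span], max_gap: int = 5,
                  min_epitopes: int = 2) -> List[Tuple[int, int, int]]:
    """Group predicted epitopes into candidate cluster regions.

    Returns (region_start, region_end, epitope_count) tuples. The gap and
    count thresholds are hypothetical illustration values.
    """
    clusters = []
    current = None  # running (start, end, count) of the group being built
    for start, end in sorted(epitopes):
        if current is not None and start - current[1] <= max_gap:
            # Overlapping or nearby epitope: extend the current group.
            current = (current[0], max(current[1], end), current[2] + 1)
        else:
            # Too far from the previous group: close it and start a new one.
            if current is not None and current[2] >= min_epitopes:
                clusters.append(current)
            current = (start, end, 1)
    if current is not None and current[2] >= min_epitopes:
        clusters.append(current)
    return clusters


# Hypothetical predicted epitope positions, for illustration only.
predicted = [(10, 18), (12, 20), (15, 23), (100, 108), (200, 208), (205, 213)]
for region in find_clusters(predicted):
    print(region)
```

In this sketch an isolated predicted epitope (here, positions 100-108) is not reported as a cluster, while runs of overlapping predictions are merged into a single region.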

[0106] Identification of housekeeping epitopes using ECRs as a starting material 
is described in copending U.S. Patent Application No. 09/561,074 entitled "METHOD OF 
EPITOPE DISCOVERY," filed April 28, 2000, which is incorporated herein by reference in 
its entirety. Epitope synchronization technology and vaccines for use in connection with this 
invention are disclosed in copending U.S. Patent Application No. 09/560,465 entitled 
"EPITOPE SYNCHRONIZATION IN ANTIGEN PRESENTING CELLS," filed April 28, 
2000, which is incorporated herein by reference in its entirety. Nucleic acid constructs useful 
as vaccines in accordance with the present invention are disclosed in copending U.S. Patent 
Application No. 09/561,572 entitled "EXPRESSION VECTORS ENCODING EPITOPES 
OF TARGET-ASSOCIATED ANTIGENS," filed April 28, 2000, which is incorporated 
herein by reference in its entirety. 

Vector Design and Vectors 

[0107] Degradation of cytosolic proteins takes place via the ubiquitin-dependent 
multi-catalytic multi-subunit protease system known as the proteasome. The proteasome 
degrades cytosolic proteins generating fragments that can then be translocated from the 
cytosol into the endoplasmic reticulum (ER) for loading onto class I MHC. Such protein 
fragments shall be referred to as class I peptides. The peptide loaded MHC are subsequently 
transported to the cell surface where they can be detected by CTL. 

[0108] The multi-catalytic activity of the proteasome is the result of its multi- 
subunit structure. Subunits are expressed from different genes and assembled post- 
translationally into the proteasome complex. A key feature of the proteasome is its bimodal 
activity, which enables it to exert its protease, or cleavage, function with two discrete kinds 
of cleavage patterns. This bimodal action of the proteasome is fundamental to 
understanding how CTL are targeted to recognize peripheral cells in the body and how this 
targeting requires synchronization between the immune system and the targeted cells. 

[0109] The housekeeping proteasome is constitutively active in all peripheral cells 
and tissues of the body. The first mode of operation for the housekeeping proteasome is to 
degrade cellular protein, recycling it into amino acids. Proteasome function is therefore a 
necessary activity for cell life. As a corollary to its housekeeping protease activity, however, 
class I peptides generated by the housekeeping proteasome are presented on all of the 
peripheral cells of the body. 

[0110] The proteasome's second mode of function is highly exclusive and occurs 
specifically in pAPCs or as a consequence of a cellular response to interferons (IFNs). In its 
second mode of activity the proteasome incorporates unique subunits, which replace the 
catalytic subunits of the constitutive housekeeping proteasome. This "modified" proteasome 
has been called the immunoproteasome, owing to its expression in pAPC and as a 
consequence of induction by IFN in body cells. 

[0111] APC define the repertoire of CTL that recirculate through the body and are 
potentially active as killer cells. CTL are activated by interacting with class I peptide 
presented on the surface of a pAPC. Activated CTL are induced to proliferate and caused to 
recirculate through the body in search of diseased cells. This is why the CTL response in the 
body is defined specifically by the class I peptides produced by the pAPC. It is important to 
remember that pAPCs express the immunoproteasome, and that as a consequence of the 
bimodal activity of the proteasome, the cleavage pattern of proteins (and the resultant class I 
peptides produced) are different from those in peripheral body cells which express 
housekeeping proteasome. The differential proteasome activity in pAPC and peripheral body 
cells, therefore, is important to consider during natural infection and with therapeutic CTL 
vaccination strategies. 

[0112] All cells of the body are capable of producing IFN in the event that they 
are infected by a pathogen such as a virus. IFN production in turn results in the expression of 
the immunoproteasome in the infected cell. Viral antigens are thereby processed by the 
immunoproteasome of the infected cell and the consequent peptides are displayed with class I 
MHC on the cell surface. At the same time, pAPC are sequestering virus antigens and are 
processing class I peptides with their immunoproteasome activity, which is normal for the 
pAPC cell type. The CTL response in the body is being stimulated specifically by the class I 
peptides produced by the pAPC. Fortunately, the infected cell is also producing class I 
peptides from the immunoproteasome, rather than the normal housekeeping proteasome. 
Thus, virus-related class I peptides are being produced that enable detection by the ensuing 
CTL response. The CTL immune response is induced by pAPC, which normally produce 
different class I peptides compared to peripheral body cells, owing to different proteasome 
activity. Therefore, during infection there is epitope synchronization between the infected 
cell and the immune system. 

[0113] This is not the case with tumors and chronic viruses, which block the 
interferon system. For tumors there is no infection in the tumor cell to induce the 
immunoproteasome expression, and chronic virus infection either directly or indirectly blocks 
immunoproteasome expression. In both cases the diseased cell maintains its display of class I 
peptides derived from housekeeping proteasome activity and avoids effective surveillance by 
CTL. 

[0114] In the case of therapeutic vaccination to eradicate tumors or chronic 
infections, the bimodal function of the proteasome and its differential activity in APC and 
peripheral cells of the body is significant. Upon vaccination with protein antigen, and before 
a CTL response can occur, the antigen must be acquired and processed into peptides that are 
subsequently presented on class I MHC on the pAPC surface. The activated CTL recirculate 
in search of cells with similar class I peptide on the surface. Cells with this peptide will be 
subjected to destruction by the cytolytic activity of the CTL. If the targeted diseased cell does 
not express the immunoproteasome, which is present in the pAPC, then the epitopes are not 
synchronized and CTL fail to find the desired peptide target on the surface of the diseased 
cell. 
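
The synchronization argument above can be sketched as a simple set comparison in Python. This is a deliberately simplified model: the antigen, its epitope labels, and the function name `ctl_can_find_target` are hypothetical, and the example treats the housekeeping and immune epitope sets as disjoint, which need not hold for a real antigen.

```python
from typing import Dict, Set


def ctl_can_find_target(antigen: Dict[str, Set[str]],
                        target_expresses_immunoproteasome: bool) -> bool:
    """Return True if CTL induced by a pAPC can find a matching peptide.

    pAPCs are modeled as always presenting immunoproteasome-derived
    (immune) epitopes; the target cell presents immune epitopes only if
    it expresses the immunoproteasome, and housekeeping epitopes otherwise.
    """
    presented_by_papc = antigen["immune"]
    if target_expresses_immunoproteasome:
        presented_by_target = antigen["immune"]
    else:
        presented_by_target = antigen["housekeeping"]
    # CTL recognize the target only if at least one epitope is shared.
    return bool(presented_by_papc & presented_by_target)


# Hypothetical antigen with placeholder epitope labels.
example_antigen = {
    "immune": {"immune_epitope_1", "immune_epitope_2"},
    "housekeeping": {"housekeeping_epitope_1"},
}

# Acutely infected cell (IFN induces the immunoproteasome): synchronized.
print(ctl_can_find_target(example_antigen, True))
# Tumor or chronically infected cell (housekeeping proteasome only): not synchronized.
print(ctl_can_find_target(example_antigen, False))
```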

[0115] Preferably, therapeutic vaccine design takes into account the class I 
peptide that is actually present on the target tissue. That is, effective antigens used to 
stimulate CTL to attack diseased tissue are those that are naturally processed and presented 
on the surface of the diseased tissue. For tumors and chronic infection this generally means 
that the CTL epitopes are those that have been processed by the housekeeping proteasome. 
In order to generate an effective therapeutic vaccine, CTL epitopes are identified based on the 
knowledge that such epitopes are, in fact, produced by the housekeeping proteasome system. 
Once identified, these epitopes, embodied as peptides, can be used to successfully immunize 
or induce therapeutic CTL responses against housekeeping proteasome expressing target cells 
in the host. 

[0116] However, in the case of DNA vaccines, there can be an additional 
consideration. The immunization with DNA requires that APCs take up the DNA and 
express the encoded proteins or peptides. It is possible to encode a discrete class I peptide on 
the DNA. By immunizing with this construct, APCs can be caused to express a 
housekeeping epitope, which is then displayed on class I MHC on the surface of the cell for 
stimulating an appropriate CTL response. Constructs for generation of proper termini of 
housekeeping epitopes have been described in U.S. Patent application No. 09/561,572 
entitled EXPRESSION VECTORS ENCODING EPITOPES OF TARGET-ASSOCIATED 
ANTIGENS, filed on April 28, 2000, which is incorporated herein by reference in its entirety. 

[0117] Embodiments of the invention provide expression cassettes that encode 
one or more embedded housekeeping epitopes, and methods for designing and testing such 
expression cassettes. The expression cassettes and constructs can encode epitopes, including 
housekeeping epitopes, derived from antigens that are associated with targets. Housekeeping 
epitopes can be liberated from the translation product(s) of the cassettes. For example, in 
some embodiments of the invention, the housekeeping epitope(s) can be flanked by arbitrary 
sequences or by sequences incorporating residues known to be favored in immunoproteasome 
cleavage sites. In further embodiments of the invention multiple epitopes can be arrayed 
head-to-tail. In some embodiments, these arrays can be made up entirely of housekeeping 
epitopes. Likewise, the arrays can include alternating housekeeping and immune epitopes. 
Alternatively, the arrays can include housekeeping epitopes flanked by immune epitopes, 
whether complete or distally truncated. In some preferred embodiments, each housekeeping 
epitope can be flanked on either side by an immune epitope, such that an array of such 
arrangements has two immune epitopes between each housekeeping epitope. Further, the 
arrays can be of any other similar arrangement. There is no restriction on placing a 
housekeeping epitope at the terminal positions of the array. The vectors can additionally 
contain authentic protein coding sequences or segments thereof containing epitope clusters as 
a source of immune epitopes. 
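
The array arrangements described above can be sketched in Python as simple string concatenations. This is an illustration only: the function names are hypothetical, the epitope strings are placeholders rather than real sequences, and an actual construct would be specified at the nucleic acid level.

```python
from itertools import zip_longest
from typing import List


def head_to_tail(epitopes: List[str]) -> str:
    """Concatenate epitopes directly, with no intervening residues."""
    return "".join(epitopes)


def alternating(housekeeping: List[str], immune: List[str]) -> str:
    """Alternate housekeeping and immune epitopes, housekeeping first."""
    parts = []
    for hk, im in zip_longest(housekeeping, immune, fillvalue=""):
        parts.append(hk)
        parts.append(im)
    return "".join(parts)


def flanked(housekeeping: List[str], immune: List[str]) -> str:
    """Flank each housekeeping epitope on both sides with an immune epitope.

    Adjacent housekeeping epitopes are therefore separated by two immune
    epitopes. Requires exactly two immune epitopes per housekeeping epitope.
    """
    if len(immune) != 2 * len(housekeeping):
        raise ValueError("need two immune epitopes per housekeeping epitope")
    parts = []
    for i, hk in enumerate(housekeeping):
        parts.extend([immune[2 * i], hk, immune[2 * i + 1]])
    return "".join(parts)


# Placeholder labels stand in for epitope amino acid sequences.
hk_epitopes = ["[HK1]", "[HK2]"]
im_epitopes = ["[IM1]", "[IM2]", "[IM3]", "[IM4]"]

print(head_to_tail(hk_epitopes))
print(alternating(hk_epitopes, im_epitopes[:2]))
print(flanked(hk_epitopes, im_epitopes))
```

The `flanked` arrangement corresponds to the embodiment in which an array has two immune epitopes between each housekeeping epitope; other arrangements described in the text, such as distally truncated flanking epitopes, are not modeled here.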

[0118] Several disclosures make reference to polyepitopes or string-of-bead 
arrays. See, for example, WO0119408A1, March 22, 2001; WO9955730A2, November 4, 
1999; WO0040261A2, July 13, 2000; WO9603144A1, February 8, 1996; EP1181314A1, 
February 27, 2002; WO0123577A3, April 5; US6074817, June 13, 2000; US5965381, 
October 12, 1999; WO9741440A1, November 6, 1997; US6130066, October 10, 2000; 
US6004777, December 21, 1999; US5990091, November 23, 1999; WO9840501A1, 
September 17, 1998; WO9840500A1, September 17, 1998; WO0118035A2, March 15, 
2001; WO02068654A2, September 6, 2002; WO0189281A2, November 29, 2001; 
WO0158478A, August 16, 2001; EP1118860A1, July 25, 2001; WO0111040A1, February 
15, 2001; WO0073438A1, December 7, 2000; WO0071158A1, November 30, 2000; 
WO0066727A1, November 9, 2000; WO0052451A1, September 8, 2000; WO0052157A1, 
September 8, 2000; WO0029008A2, May 25, 2000; WO0006723A1, February 10, 2000; all 
of which are incorporated by reference in their entirety. Additional disclosures, all of which 
are hereby incorporated by reference in their entirety, include Palmowski MJ, et al - J 
Immunol 2002;168(9):4391-8; Fang ZY, et al - Virology 2001;291(2):272-84; Firat H, et al - 
J Gene Med 2002;4(1):38-45; Smith SG, et al - Clin Cancer Res 2001;7(12):4253-61; 
Vonderheide RH, et al - Clin Cancer Res 2001;7(11):3343-8; Firat H, et al - Eur J Immunol 
2001;31(10):3064-74; Le TT, et al - Vaccine 2001;19(32):4669-75; Fayolle C, et al - J Virol 
2001;75(16):7330-8; Smith SG - Curr Opin Mol Ther 1999;1(1):10-5; Firat H, et al - Eur J 
Immunol 1999;29(10):3112-21; Mateo L, et al - J Immunol 1999;163(7):4058-63; 
Heemskerk MH, et al - Cell Immunol 1999;195(1):10-7; Woodberry T, et al - J Virol 
1999;73(7):5320-5; Hanke T, et al - Vaccine 1998;16(4):426-35; Thomson SA, et al - J 
Immunol 1998;160(4):1717-23; Toes RE, et al - Proc Natl Acad Sci USA 
1997;94(26):14660-5; Thomson SA, et al - J Immunol 1996;157(2):822-6; Thomson SA, et 
al - Proc Natl Acad Sci USA 1995;92(13):5845-9; Street MD, et al - Immunology 
2002;106(4):526-36; Hirano K, et al - Histochem Cell Biol 2002;117(1):41-53; Ward SM, et 
al - Virus Genes 2001;23(1):97-104; Liu WJ, et al - Virology 2000;273(2):374-82; Gariglio 
P, et al - Arch Med Res 1998;29(4):279-84; Suhrbier A - Immunol Cell Biol 
1997;75(4):402-8; Fomsgaard A, et al - Vaccine 1999;18(7-8):681-91; An LL, et al - J Virol 
1997;71(3):2292-302; Whitton JL, et al - J Virol 1993;67(1):348-52; Ripalti A, et al - J Clin 
Microbiol 1994;32(2):358-63; and Gilbert, S.C., et al., Nat. Biotech. 15:1280-1284, 1997. 

[0119] One important feature that the disclosures in the preceding paragraph all 
share is their lack of appreciation for the desirability of regenerating housekeeping epitopes 
when the construct is expressed in a pAPC. This understanding was not apparent until the 
present invention. Embodiments of the invention include sequences that, when processed by 
an immune proteasome, liberate or generate a housekeeping epitope. Embodiments of the 
invention also can liberate or generate such epitopes in immunogenically effective amounts. 
Accordingly, while the preceding references contain disclosures relating to polyepitope 
arrays, none is enabling of the technology necessary to provide or select a polyepitope 
capable of liberating a housekeeping epitope by action of an immunoproteasome in a pAPC. 
In contrast, embodiments of the instant invention are based upon a recognition of the 
desirability of achieving this result. Accordingly, embodiments of the instant invention 
include any nucleic acid construct that encodes a polypeptide containing at least one 
housekeeping epitope provided in a context that promotes its generation via 
immunoproteasomal activity, whether the housekeeping epitope is embedded in a string-of- 
beads array or some other arrangement. Some embodiments of the invention include uses of 
one or more of the nucleic acid constructs or their products that are specifically disclosed in 
any one or more of the above-listed references. Such uses include, for example, screening a 
polyepitope for proper liberation context of a housekeeping epitope and/or an immune 
epitope, designing an effective immunogen capable of causing presentation of a 
housekeeping epitope and/or an immune epitope on a pAPC, immunizing a patient, and the 
like. Alternative embodiments include use of only a subset of such nucleic acid constructs or 
a single such construct, while specifically excluding one or more other such constructs, for 
any of the purposes disclosed herein. Some preferred embodiments employ these and/or 
other nucleic acid sequences encoding polyepitope arrays alone or in combination. For 
example, some embodiments exclude use of polyepitope arrays from one or more of the 
above-mentioned references. Other embodiments may exclude any combination or all of the 
polyepitope arrays from the above-mentioned references collectively. Some embodiments 
include viral and/or bacterial vectors encoding polyepitope arrays, while other embodiments 
specifically exclude such vectors. Such vectors can encode carrier proteins that may have 
some immunostimulatory effect. Some embodiments include such vectors with such 
immunostimulatory/immunopotentiating effects, as opposed to immunogenic effects, while in 
other embodiments such vectors may be included. Further, in some instances viral and 
bacterial vectors encode the desired epitope as a part of substantially complete proteins which 
are not associated with the target cell. Such vectors and products are included in some 
embodiments, while excluded from others. Some embodiments relate to repeated 
administration of vectors. In some of those embodiments, nonviral and nonbacterial vectors 
are included. Likewise, some embodiments include arrays that contain extra amino acids 
between epitopes, for example anywhere from 1-6 amino acids, or more, in some 
embodiments, while other embodiments specifically exclude such arrays. 

[0120] Embodiments of the present invention also include methods, uses, 
therapies, and compositions directed to various types of targets. Such targets can include, for 
example, neoplastic cells such as those listed below, for example; and cells infected with any 
virus, bacterium, protozoan, fungus, or other agents, examples of which are listed below, in 
Tables 4-8, or which are disclosed in any of the references listed above. Alternative 
embodiments include the use of only a subset of such neoplastic cells and infected cells listed 
below, in Tables 4-8, or in any of the references disclosed herein, or a single one of the 
neoplastic cells or infected cells, while specifically excluding one or more other such 
neoplastic cells or infected cells, for any of the purposes disclosed herein. The following are 
examples of neoplastic cells that can be targeted: human sarcomas and carcinomas, e.g., 
fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, 
angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, 
synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, colon 
carcinoma, pancreatic cancer, breast cancer, ovarian cancer, prostate cancer, squamous cell 
carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland 
carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary 
carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, 
choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, testicular 
tumor, lung carcinoma, small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, 
glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, 
hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, 
neuroblastoma, retinoblastoma; leukemias, e.g., acute lymphocytic leukemia and acute 
myelocytic leukemia (myeloblastic, promyelocytic, myelomonocytic, monocytic and 
erythroleukemia); chronic leukemia (chronic myelocytic (granulocytic) leukemia and chronic 
lymphocytic leukemia); and polycythemia vera, lymphoma (Hodgkin's disease and 
non-Hodgkin's disease), multiple myeloma, Waldenstrom's macroglobulinemia, heavy chain 
disease, hepatocellular cancer, brain cancer, stomach cancer, liver cancer, and the like. 
Examples of infectious agents that infect the target cells can include the following: 
adenovirus, cytomegalovirus, Epstein-Barr virus, herpes simplex virus 1, herpes simplex 
virus 2, human herpesvirus 6, varicella-zoster virus, hepatitis B virus, hepatitis D virus, 
papilloma virus, parvovirus B19, polyomavirus BK, polyomavirus JC, hepatitis C virus, 
measles virus, rubella virus, human immunodeficiency virus (HIV), human T cell leukemia 
virus I, human T cell leukemia virus II, Chlamydia, Listeria, Salmonella, Legionella, 
Brucella, Coxiella, Rickettsia, Mycobacterium, Leishmania, Trypanosoma, Toxoplasma, 
Plasmodium, and the like. Exemplary infectious agents and neoplastic cells are also included 
in Tables 4-8 below. 

[0121] Furthermore the targets can include neoplastic cells described in or cells 
infected by agents that are described in any of the following references: Jager, E. et al., 
"Granulocyte-macrophage-colony-stimulating factor enhances immune responses to 
melanoma-associated peptides in vivo," Int. J Cancer, 67:54-62 (1996); Kundig, T.M., 
Althage, A., Hengartner, H. & Zinkernagel, R.M., "A skin test to assess CD8+ cytotoxic T 
cell activity," Proc. Natl. Acad Sci. USA, 89:7757-76 (1992); Bachmann, M.F. & Kundig, 
T.M., "In vitro vs. in vivo assays for the assessment of T- and B-cell function," Curr. Opin. 
Immunol, 6:320-326 (1994); Kundig et al., "On the role of antigen in maintaining cytotoxic 
T cell memory," Proceedings of the National Academy of Sciences of the United States of 
America, 93:9716-23 (1996); Steinmann, R.M., "The dendritic cells system and its role in 
immunogenicity," Annual Review of Immunology 9:271-96 (1991); Inaba, K. et al., 
"Identification of proliferating dendritic cell precursors in mouse blood," Journal of 
Experimental Medicine, 175:1157-67 (1992); Young, J.W. & Inaba, K., "Dendritic cells as 
adjuvants for class I major histocompatibility complex-restricted anti-tumor immunity," 
Journal of Experimental Medicine, 183:7-11 (1996); Kuby, Janis, Immunology, Second 
Edition, Chapter 15, W.H. Freeman and Company (1991); Austenst, E., Stahl, T., and de 
Gruyter, Walter, Insulin Pump Therapy, Chapter 3, Berlin, New York (1990); Remington, 
The Science and Practice of Pharmacy, Nineteenth Edition, Chapters 86-88 (1985); Cleland, 
Jeffery L. and Langer, Robert (Editor), "Formulation and delivery of proteins and peptides," 
American Chemical Society (ACS Symposium Series, No. 567) (1994); Dickinson, Becton, 
which is fixed using Tegaderm transparent dressing Tegaderm™ 1624, 3M, St. Paul, MN 
55144, USA; Santus, Giancarlo and Baker, Richard, "Osmotic drug delivery: A review of the 
patent literature," Journal of Controlled Release, 35:1-21 (1995); Rammensee, U.S. Patent 
No. 5,747,269, issued May 05, 1998; Magruder, U.S. Patent No. 5,059,423, issued October 
22, 1991; Sandbrook, U.S. Patent No. 4,552,651, issued November 25, 1985; Eckenhoff et 
al, U.S. Patent No. 3,987,790, issued October 26, 1976; Theeuwes, U.S. Patent No. 
4,455,145, issued June 19, 1984; Roth et al., U.S. Patent No. 4,929,233, issued May 29, 1990; 
van der Bruggen et al., U.S. Patent No. 5,554,506, issued September 10, 1996; 
Pfreundschuh, U.S. Patent No. 5,698,396, issued December 16, 1997; Magruder, U.S. Patent 
No. 5,110,596, issued May 5, 1992; Eckenhoff, U.S. Patent No. 4,619,652, issued October 
28, 1986; Higuchi et al., U.S. Patent No. 3,995,631, issued December 07, 1976; Maruyama, 
U.S. Patent No. 5,017,381, issued May 21, 1991; Eckenhoff, U.S. Patent No. 4,963,141, 
issued October 16, 1990; van der Bruggen et al., U.S. Patent No. 5,558,995, issued 
September 24, 1996; Stolzenberg et al. U.S. Patent No. 3,604,417, issued September 14, 
1971; Wong et al., U.S. Patent No. 5,110,597, issued May 05, 1992; Eckenhoff, U.S. Patent 
No. 4,753,651, issued June 28, 1988; Theeuwes, U.S. Patent No. 4,203,440, issued May 20, 
1980; Wong et al. U.S. Patent No. 5,023,088, issued June 11, 1991; Wong et al., U.S. Patent 
No. 4,976,966, issued December 11, 1990; Van den Eynde et al., U.S. Patent No. 5,648,226, 
issued July 15, 1997; Baker et al., U.S. Patent No. 4,838,862, issued June 13, 1989; 
Magruder, U.S. Patent No. 5,135,523, issued August 04, 1992; Higuchi et al., U.S. Patent 
No. 3,732,865, issued May 15, 1975; Theeuwes, U.S. Patent No. 4,286,067, issued August 
25, 1981; Theeuwes et al., U.S. Patent No. 5,030,216, issued July 09, 1991; Boon et al., U.S. 
Patent No. 5,405,940, issued April 11, 1995; Faste, U.S. Patent No. 4,898,582, issued 
February 06, 1990; Eckenhoff, U.S. Patent No. 5,137,727, issued August 11, 1992; Higuchi 
et al., U.S. Patent No. 3,760,804, issued September 25, 1973; Eckenhoff et al., U.S. Patent 
No. 4,300,558, issued November 12, 1981; Magruder et al., U.S. Patent No. 5,034,229, 
issued July 23, 1991; Boon et al., U.S. Patent No. 5,487,974, issued January 30, 1996; Kam 
et al., U.S. Patent No. 5,135,498, issued August 04, 1992; Magruder et al., U.S. Patent No. 
5,174,999, issued December 29, 1992; Higuchi, U.S. Patent No. 3,760,805, September 25, 
1973; Michaels, U.S. Patent No. 4,304,232, issued December 08, 1981; Magruder et al., 
U.S. Patent No. 5,037,420, issued October 15, 1991; Wolfel et al., U.S. Patent No. 
5,530,096, issued June 25, 1996; Athadye et al, U.S. Patent No. 5,169,390, issued December 
08, 1992; Balaban et al., U.S. Patent No. 5,209,746, issued May 11, 1993; Higuchi, U.S. 
Patent No. 3,929,132, issued December 30, 1975; Michaels, U.S. Patent No. 4,340,054, 
issued July 20, 1982; Magruder et al., U.S. Patent No. 5,057,318, issued October 15, 1991; 
Wolfel et al., U.S. Patent No. 5,519,117, issued May 21, 1996; Athadye et al., U.S. Patent 
No. 5,257,987, issued November 02, 1993; Linkwitz et al., U.S. Patent No. 5,221,278, issued 
June 22, 1993; Nakano et al., U.S. Patent No. 3,995,632, issued December 07, 1976; 
Michaels, U.S. Patent No. 4,367,741, issued January 11, 1983; Eckenhoff, U.S. Patent No. 
4,865,598, issued September 12, 1989; Lethe et al., U.S. Patent No. 5,774,316, issued April 
28, 1998; Eckenhoff, U.S. Patent No. 4,340,048, issued July 20, 1982; Wong, U.S. Patent 
No. 5,223,265, issued June 29, 1993; Higuchi et al., U.S. Patent No. 4,034,756, issued July 
12, 1977; Michaels, U.S. Patent No. 4,450,198, issued May 22, 1984; Eckenhoff et al, U.S. 
Patent No. 4,865,845, issued September 12, 1989; Melief et. al., U.S. Patent No. 5,554,724, 
issued September 10, 1996; Eckenhoff et al., U.S. Patent No. 4,474,575, issued October 02, 
1984; Theeuwes, U.S. Patent No. 3,760,984, issued September 25, 1983; Eckenhoff, U.S. 
Patent No. 4,350,271, issued September 21, 1982; Eckenhoff et al., U.S. Patent No. 
4,855,141, issued August 08, 1989; Zingerman, U.S. Patent No. 4,872,873, issued October 
10, 1989; Townsend et al., U.S. Patent No. 5,585,461, issued December 17, 1996; Carulli, 
J.P. et al., J. Cellular Biochem Suppl., 30/31:286-96 (1998); Türeci, Ö., Sahin, U., and 
Pfreundschuh, M., "Serological analysis of human tumor antigens: molecular definition and 
implications," Molecular Medicine Today, 3:342 (1997); Rammensee et al., MHC Ligands 
and Peptide Motifs, Landes Bioscience Austin, TX, 224-27, (1997); Parker et al., "Scheme 
for ranking potential HLA-A2 binding peptides based on independent binding of individual 
peptide side-chains," J. Immunol. 152:163-175 (1994); Kido & Ohshita, Anal. Biochem., 
230:41-47 (1995); Yamada et al, J. Biochem. (Tokyo), 95:1155-60 (1984); Kawashima et 
al., Kidney Int., 54:275-8 (1998); Nakabayshi & Ikezawa, Biochem. Int. 16:1119-25 (1988); 
Kanaseki & Ohkuma, J. Biochem. (Tokyo), 110:541-7 (1991); Wattiaux et al., J. Cell Biol, 
78:349-68 (1978); Lisman et al., Biochem. J., 178:79-87 (1979); Dean, B., Arch. Biochem. 
Biophys., 227:154-63 (1983); Overdijk et al., Adv. Exp. Med. Biol, 101:601-10 (1978); 
Stromhaug et al., Biochem. J., 335:217-24 (1998); Escola et al., J. Biol. Chem., 271:27360- 
05 (1996); Hammond et al., Am. J. Physiol, 267:F516-27 (1994); Williams & Smith, Arch. 
Biochem. Biophys., 305:298-306 (1993); Marsh, M., Methods Cell Biol, 31:319-34 (1989); 
Schmid & Mellman, Prog. Clin. Biol. Res., 270:35-49 (1988); Falk, K. et al., Nature, 
351:290, (1991); Ausubel et al., Short Protocols in Molecular Biology, Third Edition, Unit 
11.2 (1997); hypertext transfer protocol address 
134.2.96.221/scripts/hlaserver.dll/EpPredict.htm; Levy, Morel, S. et al., Immunity 12:107- 
117 (2000); Seipelt et al., "The structures of picornaviral proteinases," Virus Research, 
62:159-68, 1999; Storkus et al., U.S. Patent No. 5,989,565, issued November 23, 1999; 
Morton, U.S. Patent No. 5,993,828, issued November 30, 1999; Virus Research 62:159-168, 
(1999); Simard et al., U.S. Patent Application No. 10/026066, filed December 07, 2001; 
Simard et al., U.S. Patent Application No. 09/561571, filed April 28, 2000; Simard et al., 
U.S. Patent Application No. 09/561572, filed April 28, 2000; Miura et al., WO 99/01283, 
January 14, 1999; Simard et al., U.S. Patent Application No. 09/561074, filed April 28, 
2000; Simard et al., U.S. Patent Application No. 10/225568, filed August 20, 2002; Simard 
et al., U.S. Patent Application No. 10/005905, filed November 07, 2001; Simard et al., U.S. 
Patent Application No. 09/561074, filed April 28, 2000. 

[0122] Additional embodiments of the invention include methods, uses, therapies, 
and compositions relating to a particular antigen, whether the antigen is derived from, for 
example, a target cell or an infective agent, such as those mentioned above. Some preferred 
embodiments employ the antigens listed herein, in Tables 4-8, or in the list below, alone, as 
subsets, or in any combination. For example, some embodiments exclude use of one or more 
of those antigens. Other embodiments may exclude any combination or all of those antigens. 
Several examples of such antigens include MelanA (MART-1), gp100 (Pmel 17), tyrosinase, 
TRP-1, TRP-2, MAGE-1, MAGE-3, BAGE, GAGE-1, GAGE-2, CEA, RAGE, NY-ESO, 
SCP-1, Hom/Mel-40, PRAME, p53, H-Ras, HER-2/neu, BCR-ABL, E2A-PRL, H4-RET, 
IGH-IGK, MYL-RAR, Epstein Barr virus antigens, EBNA, human papillomavirus (HPV) 
antigens E6 and E7, TSP-180, MAGE-4, MAGE-5, MAGE-6, p185erbB2, p180erbB-3, 
c-met, nm-23H1, PSA, TAG-72-4, CAM 17.1, NuMa, K-ras, β-Catenin, CDK4, Mum-1, p16, 
as well as any of those set forth in the above mentioned references. Other antigens are 
included in Tables 4-7 below. 

[0123] Further embodiments include methods, uses, compositions, and therapies 
relating to epitopes, including, for example, those epitopes listed in Tables 4-8. These 
epitopes can be useful to flank housekeeping epitopes in screening vectors, for example. 
Some embodiments include one or more epitopes from Tables 4-8, while other embodiments 
specifically exclude one or more of such epitopes or combinations thereof. 



Table 4 

Virus | Protein | AA Position | T cell epitope MHC ligand (Antigen) | MHC molecule 
Adenovirus 3 | E3 9Kd | 30-38 | LIVIGELIL (SEQ. ID NO.: 44) | HLA-A*0201 
Adenovirus 5 | E1A | 234-243 | SGPSNTPPEI (SEQ. ID NO.: 45) | H2-Db 
Adenovirus 5 | E1B | 192-200 | VNIRNCCYI (SEQ. ID NO.: 46) | H2-Db 
Adenovirus 5 | E1A | 234-243 | SGPSNIPPEI (T>I) (SEQ. ID NO.: 47) | H2-Db 
CSFV | NS polyprotein | 2276-2284 | ENALLVALF (SEQ. ID NO.: 48) | SLA, haplotype d/d 
Dengue virus 4 | NS3 | 500-508 | TPEGIIPTL (SEQ. ID NO.: 49) | HLA-B*3501 
EBV | LMP-2 | 426-434 | CLGGLLTMV (SEQ. ID NO.: 50) | HLA-A*0201 
EBV | EBNA-1 | 480-484 | NIAEGLRAL (SEQ. ID NO.: 51) | HLA-A*0201 
EBV | EBNA-1 | 519-527 | NLRRGTALA (SEQ. ID NO.: 52) | HLA-A*0201 
EBV | EBNA-1 | 525-533 | ALAIPQCRL (SEQ. ID NO.: 53) | HLA-A*0201 




EBV | EBNA-1 | 575-582 | VLKDAKDL (SEQ. ID NO.: 54) | HLA-A*0201 
EBV | EBNA-1 | 562-570 | FMVFLQTHI (SEQ. ID NO.: 55) | HLA-A*0201 
EBV | EBNA-2 | 15-23 | HLIVDTDSL (SEQ. ID NO.: 56) | HLA-A*0201 
EBV | EBNA-2 | 22-30 | SLGNPSLSV (SEQ. ID NO.: 57) | HLA-A*0201 
EBV | EBNA-2 | 126-134 | PLASAMRML (SEQ. ID NO.: 58) | HLA-A*0201 
EBV | EBNA-2 | 132-140 | RMLWMANYI (SEQ. ID NO.: 59) | HLA-A*0201 
EBV | EBNA-2 | 133-141 | MLWMANYIV (SEQ. ID NO.: 60) | HLA-A*0201 
EBV | EBNA-2 | 151-159 | ILPQGPQTA (SEQ. ID NO.: 61) | HLA-A*0201 
EBV | EBNA-2 | 171-179 | PLRPTAPTI (SEQ. ID NO.: 62) | HLA-A*0201 
EBV | EBNA-2 | 205-213 | PLPPATLTV (SEQ. ID NO.: 63) | HLA-A*0201 
EBV | EBNA-2 | 246-254 | RMHLPVLHV (SEQ. ID NO.: 64) | HLA-A*0201 
EBV | EBNA-2 | 287-295 | PMPLPPSQL (SEQ. ID NO.: 65) | HLA-A*0201 
EBV | EBNA-2 | 294-302 | QLPPPAAPA (SEQ. ID NO.: 66) | HLA-A*0201 
EBV | EBNA-2 | 381-389 | SMPELSPVL (SEQ. ID NO.: 67) | HLA-A*0201 
EBV | EBNA-2 | 453-461 | DLDESWDYI (SEQ. ID NO.: 68) | HLA-A*0201 
EBV | BZLF1 | 43-51 | PLPCVLWPV (SEQ. ID NO.: 69) | HLA-A*0201 
EBV | BZLF1 | 167-175 | SLEECDSEL (SEQ. ID NO.: 70) | HLA-A*0201 
EBV | BZLF1 | 176-184 | EDCRYKNRV (SEQ. ID NO.: 71) | HLA-A*0201 
EBV | BZLF1 | 195-203 | QLLQHYREV (SEQ. ID NO.: 72) | HLA-A*0201 
EBV | BZLF1 | 196-204 | LLQHYREVA (SEQ. ID NO.: 73) | HLA-A*0201 
EBV | BZLF1 | 217-225 | LLKQMCPSL (SEQ. ID NO.: 74) | HLA-A*0201 
EBV | BZLF1 | 229-237 | SDPRTPDV (SEQ. ID NO.: 75) | HLA-A*0201 




EBV | EBNA-6 | 284-293 | LLDFVRFMGV (SEQ. ID NO.: 76) | HLA-A*0201 
EBV | EBNA-3 | 464-472 | SVRDRLARL (SEQ. ID NO.: 77) | HLA-A*0203 
EBV | EBNA-4 | 416-424 | IVTDFSVIK (SEQ. ID NO.: 78) | HLA-A*1101 
EBV | EBNA-4 | 399-408 | AVFDRKSDAK (SEQ. ID NO.: 79) | HLA-A*0201 
EBV | EBNA-3 | 246-253 | RYSIFFDY (SEQ. ID NO.: 80) | HLA-A24 
EBV | EBNA-6 | 881-889 | QPRAPIRPI (SEQ. ID NO.: 81) | HLA-B7 
EBV | EBNA-3 | 379-387 | RPPIFIRRL (SEQ. ID NO.: 82) | HLA-B7 
EBV | EBNA-1 | 426-434 | EPDVPPGAI (SEQ. ID NO.: 83) | HLA-B7 
EBV | EBNA-1 | 228-236 | IPQCRLTPL (SEQ. ID NO.: 84) | HLA-B7 
EBV | EBNA-1 | 546-554 | GPGPQPGPL (SEQ. ID NO.: 85) | HLA-B7 
EBV | EBNA-1 | 550-558 | QPGPLRESI (SEQ. ID NO.: 86) | HLA-B7 
EBV | EBNA-1 | 72-80 | RPQKRPSCI (SEQ. ID NO.: 87) | HLA-B7 
EBV | EBNA-2 | 224-232 | PPTPLLTVL (SEQ. ID NO.: 88) | HLA-B7 
EBV | EBNA-2 | 241-249 | TPSPPRMHL (SEQ. ID NO.: 89) | HLA-B7 
EBV | EBNA-2 | 244-252 | PPRMHLPVL (SEQ. ID NO.: 90) | HLA-B7 
EBV | EBNA-2 | 254-262 | VPDQSMHPL (SEQ. ID NO.: 91) | HLA-B7 
EBV | EBNA-2 | 446-454 | PPSIDPADL (SEQ. ID NO.: 92) | HLA-B7 
EBV | BZLF1 | 44-52 | LPCVLWPVL (SEQ. ID NO.: 93) | HLA-B7 
EBV | BZLF1 | 222-231 | CPSLDVDSII (SEQ. ID NO.: 94) | HLA-B7 
EBV | BZLF1 | 234-242 | TPDVLHEDL (SEQ. ID NO.: 95) | HLA-B7 
EBV | EBNA-3 | 339-347 | FLRGRAYGL (SEQ. ID NO.: 96) | HLA-B8 
EBV | EBNA-3 | 26-34 | QAKWRLQTL (SEQ. ID NO.: 97) | HLA-B8 




EBV | EBNA-3 | 325-333 | AYPLHEQHG (SEQ. ID NO.: 98) | HLA-B8 
EBV | EBNA-3 | 158-166 | YIKSFVSDA (SEQ. ID NO.: 99) | HLA-B8 
EBV | LMP-2 | 236-244 | RRRWRRLTV (SEQ. ID NO.: 100) | HLA-B*2704 
EBV | EBNA-6 | 258-266 | RRIYDLIEL (SEQ. ID NO.: 101) | HLA-B*2705 
EBV | EBNA-3 | 458-466 | YPLHEQHGM (SEQ. ID NO.: 102) | HLA-B*3501 
EBV | EBNA-3 | 458-466 | YPLHEQHGM (SEQ. ID NO.: 103) | HLA-B*3503 
HCV | NS3 | 389-397 | HSKKKCDEL (SEQ. ID NO.: 104) | HLA-B8 
HCV | env E | 44-51 | ASRCWVAM (SEQ. ID NO.: 105) | HLA-B*3501 
HCV | core protein | 27-35 | GQIVGGVYL (SEQ. ID NO.: 106) | HLA-B*40012 
HCV | NS1 | 77-85 | PPLTDFDQGW (SEQ. ID NO.: 107) | HLA-B*5301 
HCV | core protein | 18-27 | LMGYIPLVGA (SEQ. ID NO.: 108) | H2-Dd 
HCV | core protein | 16-25 | ADLMGYIPLV (SEQ. ID NO.: 109) | H2-Dd 
HCV | NS5 | 409-424 | MSYSWTGALVTPCAEE (SEQ. ID NO.: 110) | H2-Dd 
HCV | NS1 | 205-213 | KHPDATYSR (SEQ. ID NO.: 111) | Papa-A06 
HCV-1 | NS3 | 400-409 | KLVALGINAV (SEQ. ID NO.: 112) | HLA-A*0201 
HCV-1 | NS3 | 440-448 | GDFDSVIDC (SEQ. ID NO.: 113) | Patr-B16 
HCV-1 | env E | 118-126 | GNASRCWVA (SEQ. ID NO.: 114) | Patr-B16 
HCV-1 | NS1 | 159-167 | TRPPLGNWF (SEQ. ID NO.: 115) | Patr-B13 
HCV-1 | NS3 | 351-359 | VPHPNIEEV (SEQ. ID NO.: 116) | Patr-B13 
HCV-1 | NS3 | 438-446 | YTGDFDSVI (SEQ. ID NO.: 117) | Patr-B01 





HCV-1 | NS4 | 328-335 | SWAIKWEY (SEQ. ID NO.: 118) | Patr-A11 
HCV-1 | NS1 | 205-213 | KHPDATYSR (SEQ. ID NO.: 119) | Patr-A04 
HCV-1 | NS3 | 440-448 | GDFDSVIDC (SEQ. ID NO.: 120) | Patr-A04 
HIV | gp41 | 583-591 | RYLKDQQLL (SEQ. ID NO.: 121) | HLA-A24 
HIV | gag p24 | 267-275 | IVGLNKTVR (SEQ. ID NO.: 122) | HLA-A*3302 
HIV | gag p24 | 262-270 | EIYKRWDL (SEQ. ID NO.: 123) | HLA-B8 
HIV | gag p24 | 261-269 | GEIYKRWII (SEQ. ID NO.: 124) | HLA-B8 
HIV | gag p17 | 93-101 | EKDTKEAL (SEQ. ID NO.: 125) | HLA-B8 
HIV | gp41 | 586-593 | YLKDQQLL (SEQ. ID NO.: 126) | HLA-B8 
HIV | gag p24 | 267-277 | ILGLNKTVRMY (SEQ. ID NO.: 127) | HLA-B*1501 
HIV | gp41 | 584-592 | ERYLKDQQL (SEQ. ID NO.: 128) | HLA-B14 
HIV | nef | 115-125 | YHTQGYFPQWQ (SEQ. ID NO.: 129) | HLA-B17 
HIV | nef | 117-128 | TQGYFPQWQNYT (SEQ. ID NO.: 130) | HLA-B17 
HIV | gp120 | 314-322 | GRAFVTIGK (SEQ. ID NO.: 131) | HLA-B*2705 
HIV | gag p24 | 263-271 | KRWIILGLN (SEQ. ID NO.: 132) | HLA-B*2702 
HIV | nef | 72-82 | QVPLRPMTYK (SEQ. ID NO.: 133) | HLA-B*3501 
HIV | nef | 117-125 | TQGYFPQWQ (SEQ. ID NO.: 134) | HLA-B*3701 
HIV | gag p24 | 143-151 | HQAISPRTL (SEQ. ID NO.: 135) | HLA-Cw*0301 
HIV | gag p24 | 140-151 | QMVHQAISPRTL (SEQ. ID NO.: 136) | HLA-Cw*0301 
HIV | gp120 | 431-440 | MYAPPIGGQI (SEQ. ID NO.: 137) | H2-Kd 
HIV | gp160 | 318-327 | RGPGRAFVTI (SEQ. ID NO.: 138) | H2-Dd 
HIV | gp120 | 17-29 | MPGRAFVTI (SEQ. ID NO.: 139) | H2-Ld 



HIV-1 | RT | 476-484 | ELKEPVHGV (SEQ. ID NO.: 140) | HLA-A*0201 
HIV-1 | nef | 190-198 | AFHHVAREL (SEQ. ID NO.: 141) | HLA-A*0201 
HIV-1 | gp160 | 120-128 | KLTPLCVTL (SEQ. ID NO.: 142) | HLA-A*0201 
HIV-1 | gp160 | 814-823 | SLLNATDIAV (SEQ. ID NO.: 143) | HLA-A*0201 
HIV-1 | RT | 179-187 | VIYQYMDDL (SEQ. ID NO.: 144) | HLA-A*0201 
HIV-1 | gag p17 | 77-85 | SLYNTVATL (SEQ. ID NO.: 145) | HLA-A*0201 
HIV-1 | gp160 | 315-329 | RGPGRAFVTI (SEQ. ID NO.: 146) | HLA-A*0201 
HIV-1 | gp41 | 768-778 | RLRDLLLIVTR (SEQ. ID NO.: 147) | HLA-A3 
HIV-1 | nef | 73-82 | QVPLRPMTYK (SEQ. ID NO.: 148) | HLA-A3 
HIV-1 | gp120 | 36-45 | TVYYGVPVWK (SEQ. ID NO.: 149) | HLA-A3 
HIV-1 | gag p17 | 20-29 | RLRPGGKKK (SEQ. ID NO.: 150) | HLA-A3 
HIV-1 | gp120 | 38-46 | VYYGVPVWK (SEQ. ID NO.: 151) | HLA-A3 
HIV-1 | nef | 74-82 | VPLRPMTYK (SEQ. ID NO.: 152) | HLA-A*1101 
HIV-1 | gag p24 | 325-333 | AIFQSSMTK (SEQ. ID NO.: 153) | HLA-A*1101 
HIV-1 | nef | 73-82 | QVPLRPMTYK (SEQ. ID NO.: 154) | HLA-A*1101 
HIV-1 | nef | 83-94 | AAVDLSHFLKEK (SEQ. ID NO.: 155) | HLA-A*1101 
HIV-1 | gag p24 | 349-359 | ACQGVGGPGGHK (SEQ. ID NO.: 156) | HLA-A*1101 
HIV-1 | gag p24 | 203-212 | ETINEEAAEW (SEQ. ID NO.: 157) | HLA-A25 
HIV-1 | nef | 128-137 | TPGPGVRYPL (SEQ. ID NO.: 158) | HLA-B7 
HIV-1 | gag p17 | 24-31 | GGKKKYKL (SEQ. ID NO.: 159) | HLA-B8 
HIV-1 | gp120 | 2-10 | RVKEKYQHL (SEQ. ID NO.: 160) | HLA-B8 
HIV-1 | gag p24 | 298-306 | DRFYKTLRA (SEQ. ID NO.: 161) | HLA-B14 
HIV-1 | NEF | 132-147 | GVRYPLTFGWCYKLVP (SEQ. ID NO.: 162) | HLA-B18 
HIV-1 | gag p24 | 265-24 | KRWHLGLNK (SEQ. ID NO.: 163) | HLA-B*2705 
HIV-1 | nef | 190-198 | AFHHVAREL (SEQ. ID NO.: 164) | HLA-B*5201 
EBV | EBNA-6 | 335-343 | KEHVIQNAF (SEQ. ID NO.: 165) | HLA-B44 
EBV | EBNA-6 | 130-139 | EENLLDFVRF (SEQ. ID NO.: 166) | HLA-B*4403 
EBV | EBNA-2 | 42-51 | DTPLIPLTIF (SEQ. ID NO.: 167) | HLA-B51 
EBV | EBNA-6 | 213-222 | QNGALAINTF (SEQ. ID NO.: 168) | HLA-B62 
EBV | EBNA-3 | 603-611 | RLRAEAGVK (SEQ. ID NO.: 169) | HLA-A3 
HBV | sAg | 348-357 | GLSPTVWLSV (SEQ. ID NO.: 170) | HLA-A*0201 
HBV | sAg | 335-343 | WLSLLVPFV (SEQ. ID NO.: 171) | HLA-A*0201 
HBV | cAg | 18-27 | FLPSDFFPSV (SEQ. ID NO.: 172) | HLA-A*0201 
HBV | cAg | 18-27 | FLPSDFFPSV (SEQ. ID NO.: 173) | HLA-A*0202 
HBV | cAg | 18-27 | FLPSDFFPSV (SEQ. ID NO.: 174) | HLA-A*0205 
HBV | cAg | 18-27 | FLPSDFFPSV (SEQ. ID NO.: 175) | HLA-A*0206 
HBV | pol | 575-583 | FLLSLGIHL (SEQ. ID NO.: 176) | HLA-A*0201 
HBV | pol | 816-824 | SLYADSPSV (SEQ. ID NO.: 177) | HLA-A*0201 
HBV | pol | 455-463 | GLSRYVARL (SEQ. ID NO.: 178) | HLA-A*0201 
HBV | env | 338-347 | LLVPFVQWFV (SEQ. ID NO.: 179) | HLA-A*0201 
HBV | pol | 642-650 | ALMPLYACI (SEQ. ID NO.: 180) | HLA-A*0201 
HBV | env | 378-387 | LLPIFFCLWV (SEQ. ID NO.: 181) | HLA-A*0201 
HBV | pol | 538-546 | YMDDVVLGA (SEQ. ID NO.: 182) | HLA-A*0201 
HBV | env | 250-258 | LLLCLIFLL (SEQ. ID NO.: 183) | HLA-A*0201 





HBV | env | 260-269 | LLDYQGMLPV (SEQ. ID NO.: 184) | HLA-A*0201 
HBV | env | 370-379 | SIVSPFEPLL (SEQ. ID NO.: 185) | HLA-A*0201 
HBV | env | 183-191 | FLLTRILTI (SEQ. ID NO.: 186) | HLA-A*0201 
HBV | cAg | 88-96 | YVNVNMGLK (SEQ. ID NO.: 187) | HLA-A*1101 
HBV | cAg | 141-151 | STLPETTWRR (SEQ. ID NO.: 188) | HLA-A*3101 
HBV | cAg | 141-151 | STLPETTWRR (SEQ. ID NO.: 189) | HLA-A*6801 
HBV | cAg | 18-27 | FLPSDFFPSV (SEQ. ID NO.: 190) | HLA-A*6801 
HBV | sAg | 28-39 | IPQSLDSWWTSL (SEQ. ID NO.: 191) | H2-Ld 
HBV | cAg | 93-100 | MGLKFRQL (SEQ. ID NO.: 192) | H2-Kb 
HBV | preS | 141-149 | STBXQSGXQ (SEQ. ID NO.: 193) | HLA-A*0201 
HCMV | gpB | 618-628 | FIAGNSAYEYV (SEQ. ID NO.: 194) | HLA-A*0201 
HCMV | E1 | 978-989 | SDEEFAIVAYTL (SEQ. ID NO.: 195) | HLA-B18 
HCMV | pp65 | 397-411 | DDVWTSGSDSDEELV (SEQ. ID NO.: 196) | HLA-B35 
HCMV | pp65 | 123-131 | IPSINVHHY (SEQ. ID NO.: 197) | HLA-B*3501 
HCMV | pp65 | 495-504 | NLVPMVATVQ (SEQ. ID NO.: 198) | HLA-A*0201 
HCMV | pp65 | 415-429 | RKTPRVTGGGAMAGA (SEQ. ID NO.: 199) | HLA-B7 
HCV | MP | 17-25 | DLMGYIPLV (SEQ. ID NO.: 200) | HLA-A*0201 
HCV | MP | 63-72 | LLALLSCLTV (SEQ. ID NO.: 201) | HLA-A*0201 
HCV | MP | 105-112 | ILHTPGCV (SEQ. ID NO.: 202) | HLA-A*0201 
HCV | env E | 66-75 | QLRRHIDLLV (SEQ. ID NO.: 203) | HLA-A*0201 
HCV | env E | 88-96 | DLCGSVFLV (SEQ. ID NO.: 204) | HLA-A*0201 
HCV | env E | 172-180 | SMVGNWAKV (SEQ. ID NO.: 205) | HLA-A*0201 
HCV | NS1 | 308-316 | HLHQNTVDV (SEQ. ID NO.: 206) | HLA-A*0201 
HCV | NS1 | 340-348 | FLLLADARV (SEQ. ID NO.: 207) | HLA-A*0201 
HCV | NS2 | 234-246 | GLRDLAVAVEPVV (SEQ. ID NO.: 208) | HLA-A*0201 
HCV | NS1 | 18-28 | SLLAPGAKQNV (SEQ. ID NO.: 209) | HLA-A*0201 
HCV | NS1 | 19-28 | LLAPGAKQNV (SEQ. ID NO.: 210) | HLA-A*0201 
HCV | NS4 | 192-201 | LLFNILGGWV (SEQ. ID NO.: 211) | HLA-A*0201 
HCV | NS3 | 579-587 | YLVAYQATV (SEQ. ID NO.: 212) | HLA-A*0201 
HCV | core protein | 34-43 | YLLPRRGPRL (SEQ. ID NO.: 213) | HLA-A*0201 
HCV | MP | 63-72 | LLALLSCLTI (SEQ. ID NO.: 214) | HLA-A*0201 
HCV | NS4 | 174-182 | SLMAFTAAV (SEQ. ID NO.: 215) | HLA-A*0201 
HCV | NS3 | 67-75 | CINGVCWTV (SEQ. ID NO.: 216) | HLA-A*0201 
HCV | NS3 | 163-171 | LLCPAGHAV (SEQ. ID NO.: 217) | HLA-A*0201 
HCV | NS5 | 239-247 | BLDSFDPLV (SEQ. ID NO.: 218) | HLA-A*0201 
HCV | NS4A | 236-244 | ILAGYGAGV (SEQ. ID NO.: 219) | HLA-A*0201 
HCV | NS5 | 714-722 | GLQDCTMLV (SEQ. ID NO.: 220) | HLA-A*0201 
HCV | NS3 | 281-290 | TGAPVTYSTY (SEQ. ID NO.: 221) | HLA-A*0201 
HCV | NS4A | 149-157 | HMWNFISGI (SEQ. ID NO.: 222) | HLA-A*0201 
HCV | NS5 | 575-583 | RVCEKMALY (SEQ. ID NO.: 223) | HLA-A*0201-A3 
HCV | NS1 | 238-246 | TINYTIFK (SEQ. ID NO.: 224) | HLA-A*1101 
HCV | NS2 | 109-116 | YISWCLWW (SEQ. ID NO.: 225) | HLA-A23 
HCV | core protein | 40-48 | GPRLGVRAT (SEQ. ID NO.: 226) | HLA-B7 





HIV-1 | gp120 | 380-388 | SFNCGGEFF (SEQ. ID NO.: 227) | HLA-Cw*0401 
HIV-1 | RT | 206-214 | TEMEKEGKI (SEQ. ID NO.: 228) | H2-Kk 
HIV-1 | p17 | 18-26 | KIRLRPGGK (SEQ. ID NO.: 229) | HLA-A*0301 
HIV-1 | p17 | 20-29 | RLRPGGKKKY (SEQ. ID NO.: 230) | HLA-A*0301 
HIV-1 | RT | 325-333 | AIFQSSMTK (SEQ. ID NO.: 231) | HLA-A*0301 
HIV-1 | p17 | 84-92 | TLYCVHQRI (SEQ. ID NO.: 232) | HLA-A11 
HIV-1 | RT | 508-517 | IYQEPFKNLK (SEQ. ID NO.: 233) | HLA-A11 
HIV-1 | p17 | 28-36 | KYKLKHIVW (SEQ. ID NO.: 234) | HLA-A24 
HIV-1 | gp120 | 53-62 | LFCASDAKAY (SEQ. ID NO.: 235) | HLA-A24 
HIV-1 | gag p24 | 145-155 | QAISPRTLNAW (SEQ. ID NO.: 236) | HLA-A25 
HIV-1 | gag p24 | 167-175 | EVIPMFSAL (SEQ. ID NO.: 237) | HLA-A26 
HIV-1 | RT | 593-603 | ETFYVDGAANR (SEQ. ID NO.: 238) | HLA-A26 
HIV-1 | gp41 | 775-785 | RLRDLLLIVTR (SEQ. ID NO.: 239) | HLA-A31 
HIV-1 | RT | 559-568 | PIQKETWETW (SEQ. ID NO.: 240) | HLA-A32 
HIV-1 | gp120 | 419-427 | RIKQIINMW (SEQ. ID NO.: 241) | HLA-A32 
HIV-1 | RT | 71-79 | ITLWQRPLV (SEQ. ID NO.: 242) | HLA-A*6802 
HIV-1 | RT | 85-93 | DTVLEEMNL (SEQ. ID NO.: 243) | HLA-A*6802 
HIV-1 | RT | 71-79 | ITLWQRPLV (SEQ. ID NO.: 244) | HLA-A*7401 
HIV-1 | gag p24 | 148-156 | SPRTLNAWV (SEQ. ID NO.: 245) | HLA-B7 
HIV-1 | gag p24 | 179-187 | ATPQDLNTM (SEQ. ID NO.: 246) | HLA-B7 
HIV-1 | gp120 | 303-312 | RPNNNTRKSI (SEQ. ID NO.: 247) | HLA-B7 
HIV-1 | gp41 | 843-851 | IPRRIRQGL (SEQ. ID NO.: 248) | HLA-B7 



HIV-1 | p17 | 74-82 | ELRSLYNTV (SEQ. ID NO.: 249) | HLA-B8 
HIV-1 | nef | 13-20 | WPTVRERM (SEQ. ID NO.: 250) | HLA-B8 
HIV-1 | nef | 90-97 | FLKEKGGL (SEQ. ID NO.: 251) | HLA-B8 
HIV-1 | gag p24 | 183-191 | DLNTMLNTV (SEQ. ID NO.: 252) | HLA-B14 
HIV-1 | p17 | 18-27 | KIRLRPGGKK (SEQ. ID NO.: 253) | HLA-B27 
HIV-1 | p17 | 19-27 | ERLRPGGKK (SEQ. ID NO.: 254) | HLA-B27 
HIV-1 | gp41 | 791-799 | GRRGWEALKY (SEQ. ID NO.: 255) | HLA-B27 
HIV-1 | nef | 73-82 | QVPLRPMTYK (SEQ. ID NO.: 256) | HLA-B27 
HIV-1 | gp41 | 590-597 | RYLKDQQL (SEQ. ID NO.: 257) | HLA-B27 
HIV-1 | nef | 105-114 | RRQDDLDLWI (SEQ. ID NO.: 258) | HLA-B*2705 
HIV-1 | nef | 134-141 | RYPLTFGW (SEQ. ID NO.: 259) | HLA-B*2705 
HIV-1 | p17 | 36-44 | WASRELERF (SEQ. ID NO.: 260) | HLA-B35 
HIV-1 | gag p24 | 262-270 | TVLDVGDAY (SEQ. ID NO.: 261) | HLA-B35 
HIV-1 | gp120 | 42-52 | VPVWKEATTTL (SEQ. ID NO.: 262) | HLA-B35 
HIV-1 | p17 | 36-44 | NSSKVSQNY (SEQ. ID NO.: 263) | HLA-B35 
HIV-1 | gag p24 | 254-262 | PPIPVGDIY (SEQ. ID NO.: 264) | HLA-B35 
HIV-1 | RT | 342-350 | HPDIVIYQY (SEQ. ID NO.: 265) | HLA-B35 
HIV-1 | gp41 | 611-619 | TAVPWNASW (SEQ. ID NO.: 266) | HLA-B35 
HIV-1 | | 245-253 | NPVPVGNIY (SEQ. ID NO.: 267) | HLA-B35 
HIV-1 | nef | 120-128 | YFPDWQNYT (SEQ. ID NO.: 268) | HLA-B37 
HIV-1 | gag p24 | 193-201 | GHQAAMQML (SEQ. ID NO.: 269) | HLA-B42 
HIV-1 | p17 | 20-29 | RLRPGGKKKY (SEQ. ID NO.: 270) | HLA-B42 



HIV-1 | RT | 438-446 | YPGDCVRQL (SEQ. ID NO.: 271) | HLA-B42 
HIV-1 | RT | 591-600 | GAETFYVDGA (SEQ. ID NO.: 272) | HLA-B45 
HIV-1 | gag p24 | 325-333 | NANPDCKTI (SEQ. ID NO.: 273) | HLA-B51 
HIV-1 | gag p24 | 275-282 | RMYSPTSI (SEQ. ID NO.: 274) | HLA-B52 
HIV-1 | gp120 | 42-51 | VPVWKEATTT (SEQ. ID NO.: 275) | HLA-B*5501 
HIV-1 | gag p24 | 147-155 | ISPRTLNAW (SEQ. ID NO.: 276) | HLA-B57 
HIV-1 | gag p24 | 240-249 | TSTLQEQIGW (SEQ. ID NO.: 277) | HLA-B57 
HIV-1 | gag p24 | 162-172 | KAFSPEVDPMF (SEQ. ID NO.: 278) | HLA-B57 
HIV-1 | gag p24 | 311-319 | QASQEVKNW (SEQ. ID NO.: 279) | HLA-B57 
HIV-1 | gag p24 | 311-319 | QASQDVKNW (SEQ. ID NO.: 280) | HLA-B57 
HIV-1 | nef | 116-125 | HTQGYFPDWQ (SEQ. ID NO.: 281) | HLA-B57 
HIV-1 | nef | 120-128 | YFPDWQNYT (SEQ. ID NO.: 282) | HLA-B57 
HIV-1 | gag p24 | 240-249 | TSTLQEQIGW (SEQ. ID NO.: 283) | HLA-B58 
HIV-1 | p17 | 20-29 | RLRPGGKKKY (SEQ. ID NO.: 284) | HLA-B62 
HIV-1 | p24 | 268-277 | LGLNKIVRMY (SEQ. ID NO.: 285) | HLA-B62 
HIV-1 | RT | 415-426 | LVGKLNWASQIY (SEQ. ID NO.: 286) | HLA-B62 
HIV-1 | RT | 476-485 | ILKEPVHGVY (SEQ. ID NO.: 287) | HLA-B62 
HIV-1 | nef | 117-127 | TQGYFPDWQNY (SEQ. ID NO.: 288) | HLA-B62 
HIV-1 | nef | 84-91 | AVDLSHFL (SEQ. ID NO.: 289) | HLA-B62 
HIV-1 | gag p24 | 168-175 | VIPMFSAL (SEQ. ID NO.: 290) | HLA-Cw*0102 
HIV-1 | gp120 | 376-384 | FNCGGEFFY (SEQ. ID NO.: 291) | HLA-A29 
HIV-1 | gp120 | 375-383 | SFNCGGEFF (SEQ. ID NO.: 292) | HLA-B15 
HIV-1 | nef | 136-145 | PLTFGWCYKL (SEQ. ID NO.: 293) | HLA-A*0201 
HIV-1 | nef | 180-189 | VLEWRFDSRL (SEQ. ID NO.: 294) | HLA-A*0201 
HIV-1 | nef | 68-77 | FPVTPQVPLR (SEQ. ID NO.: 295) | HLA-B7 
HIV-1 | nef | 128-137 | TPGPGVRYPL (SEQ. ID NO.: 296) | HLA-B7 
HIV-1 | gag p24 | 308-316 | QASQEVKNW (SEQ. ID NO.: 297) | HLA-Cw*0401 
HIV-1 IIIB | RT | 273-282 | VPLDEDFRKY (SEQ. ID NO.: 298) | HLA-B35 
HIV-1 IIIB | RT | 25-33 | NPDIVIYQY (SEQ. ID NO.: 299) | HLA-B35 
HIV-1 IIIB | gp41 | 557-565 | RAIEAQAHL (SEQ. ID NO.: 300) | HLA-B51 
HIV-1 IIIB | RT | 231-238 | TAFTIPSI (SEQ. ID NO.: 301) | HLA-B51 
HIV-1 IIIB | p24 | 215-223 | VHPVHAGPIA (SEQ. ID NO.: 302) | HLA-B*5501 
HIV-1 IIIB | gp120 | 156-165 | NCSFNISTSI (SEQ. ID NO.: 303) | HLA-Cw8 
HIV-1 IIIB | gp120 | 241-249 | CTNVSTVQC (SEQ. ID NO.: 304) | HLA-Cw8 
HIV-1 SF2 | gp120 | 312-320 | IGPGRAFHT (SEQ. ID NO.: 305) | H2-Dd 
HIV-1 SF2 | pol | 25-33 | NPDIVIYQY (SEQ. ID NO.: 306) | HLA-B*3501 
HIV-1 SF2 | pol | 432-441 | EPIVGAETFY (SEQ. ID NO.: 307) | HLA-B*3501 
HIV-1 SF2 | pol | 432-440 | EPIVGAETF (SEQ. ID NO.: 308) | HLA-B*3501 
HIV-1 SF2 | pol | 6-14 | SPAIFQSSM (SEQ. ID NO.: 309) | HLA-B*3501 
HIV-1 SF2 | pol | 59-68 | VPLDKDFRKY (SEQ. ID NO.: 310) | HLA-B*3501 
HIV-1 SF2 | pol | 6-14 | IPLTEEAEL (SEQ. ID NO.: 311) | HLA-B*3501 
HIV-1 SF2 | nef | 69-79 | RPQVPLRPMTY (SEQ. ID NO.: 312) | HLA-B*3501 
HIV-1 SF2 | nef | 66-74 | FPVRPQVPL (SEQ. ID NO.: 313) | HLA-B*3501 
HIV-1 SF2 | env | 10-18 | DPNPQEVVL (SEQ. ID NO.: 314) | HLA-B*3501 
HIV-1 SF2 | env | 7-15 | RPTVSTQLL (SEQ. ID NO.: 315) | HLA-B*3501 
HIV-1 SF2 | pol | 6-14 | IPLTEEAEL (SEQ. ID NO.: 316) | HLA-B51 
HIV-1 SF2 | env | 10-18 | DPNPQEVVL (SEQ. ID NO.: 317) | HLA-B51 
HIV-1 SF2 | gag p24 | 199-207 | AMQMLKETI (SEQ. ID NO.: 318) | H2-Kd 
HIV-2 | gag p24 | 182-190 | TPYDINQML (SEQ. ID NO.: 319) | HLA-B*5301 
HIV-2 | | 260-269 | RRWIQLGLQKV (SEQ. ID NO.: 320) | HLA-B*2703 
HIV-1 SF2 | gp41 | 593-607 | GIWGCSGKLICTTAV (SEQ. ID NO.: 321) | HLA-B17 
HIV-1 SF2 | gp41 | 753-767 | ALIWEDLRSLCLFSY (SEQ. ID NO.: 322) | HLA-B22 
HPV 6b | E7 | 21-30 | GLHCYEQLV (SEQ. ID NO.: 323) | HLA-A*0201 
HPV 6b | E7 | 47-55 | PLKQHFQIV (SEQ. ID NO.: 324) | HLA-A*0201 
HPV 11 | E7 | 4-12 | RLVTLKDIV (SEQ. ID NO.: 325) | HLA-A*0201 
HPV 16 | E7 | 86-94 | TLGIVCPIC (SEQ. ID NO.: 326) | HLA-A*0201 
HPV 16 | E7 | 85-93 | GTLGIVCPI (SEQ. ID NO.: 327) | HLA-A*0201 
HPV 16 | E7 | 12-20 | MLDLQPETT (SEQ. ID NO.: 328) | HLA-A*0201 
HPV 16 | E7 | 11-20 | YMLDLQPETT (SEQ. ID NO.: 329) | HLA-A*0201 
HPV 16 | E6 | 15-22 | RPRKLPQL (SEQ. ID NO.: 330) | HLA-B7 
HPV 16 | E6 | 49-57 | RAHYNTVTF (SEQ. ID NO.: 331) | H2-Db 
HSV | | 498-505 | SSIEFARL (SEQ. ID NO.: 332) | H2-Kb 
HSV-1 | gp C | 480-488 | GIGIGVLAA (SEQ. ID NO.: 333) | HLA-A*0201 
HSV-1 | ICP27 | 448-456 | DYATLGVGV (SEQ. ID NO.: 334) | H2-Kd 














HSV-1 | ICP27 | 322-332 | LYRTFAGNPRA (SEQ. ID NO.: 335) | H2-Kd 
HSV-1 | UL39 | 822-829 | QTFDFGRL (SEQ. ID NO.: 336) | H2-Kb 
HSV-2 | gpC | 446-454 | GAGIGVAVL (SEQ. ID NO.: 337) | HLA-A*0201 
HTLV-1 | TAX | 11-19 | LLFGYPVYV (SEQ. ID NO.: 338) | HLA-A*0201 
Influenza | MP | 58-66 | GELGFVFTL (SEQ. ID NO.: 339) | HLA-A*0201 
Influenza | MP | 59-68 | ELGFVFTLTV (SEQ. ID NO.: 340) | HLA-A*0201 
Influenza | NP | 265-273 | ILRGSVAHK (SEQ. ID NO.: 341) | HLA-A3 
Influenza | NP | 91-99 | KTGGPIYKR (SEQ. ID NO.: 342) | HLA-A*6801 
Influenza | NP | 380-388 | ELRSRYWAI (SEQ. ID NO.: 343) | HLA-B8 
Influenza | NP | 381-388 | LRSRYWAI (SEQ. ID NO.: 344) | HLA-B*2702 
Influenza | NP | 339-347 | EDLRVLSFI (SEQ. ID NO.: 345) | HLA-B*3701 
Influenza | NS1 | 158-166 | GEISPLPSL (SEQ. ID NO.: 346) | HLA-B44 
Influenza | NP | 338-346 | FEDLRVLSF (SEQ. ID NO.: 347) | HLA-B44 
Influenza | NS1 | 158-166 | GEISPLPSL (SEQ. ID NO.: 348) | HLA-B*4402 
Influenza | NP | 338-346 | FEDLRVLSF (SEQ. ID NO.: 349) | HLA-B*4402 
Influenza | PB1 | 591-599 | VSDGGPKLY (SEQ. ID NO.: 350) | HLA-A1 
Influenza A | NP | 44-52 | CTELKLSDY (SEQ. ID NO.: 351) | HLA-A1 
Influenza | NS1 | 122-130 | AIMDKNIIL (SEQ. ID NO.: 352) | HLA-A*0201 
Influenza A | NS1 | 123-132 | EMDKNIILKA (SEQ. ID NO.: 353) | HLA-A*0201 
Influenza A | NP | 383-391 | SRYWAERTR (SEQ. ID NO.: 354) | HLA-B*2705 
Influenza A | NP | 147-155 | TYQRTRALV (SEQ. ID NO.: 355) | H2-Kd 
Influenza A | HA | 210-219 | TYVSVSTSTL (SEQ. ID NO.: 356) | H2-Kd 
Influenza A | HA | 518-526 | IYSTVASSL (SEQ. ID NO.: 357) | H2-Kd 
Influenza A | HA | 259-266 | FEANGNLI (SEQ. ID NO.: 358) | H2-Kk 
Influenza A | HA | 10-18 | IEGGWTGMI (SEQ. ID NO.: 359) | H2-Kk 
Influenza A | NP | 50-57 | SDYEGRLI (SEQ. ID NO.: 360) | H2-Kk 
Influenza A | NS1 | 152-160 | EEGATVGEI (SEQ. ID NO.: 361) | H2-Kk 
Influenza A34 | NP | 366-374 | ASNENMETM (SEQ. ID NO.: 362) | H2-Db 
Influenza A68 | NP | 366-374 | ASNENMDAM (SEQ. ID NO.: 363) | H2-Db 
Influenza B | NP | 85-94 | KLGEFYNQMM (SEQ. ID NO.: 364) | HLA-A*0201 
Influenza B | NP | 85-94 | KAGEFYNQMM (SEQ. ID NO.: 365) | HLA-A*0201 
Influenza JAP | HA | 204-212 | LYQNVGTYV (SEQ. ID NO.: 366) | H2-Kd 
Influenza JAP | HA | 210-219 | TYVSVGTSTL (SEQ. ID NO.: 367) | H2-Kd 
Influenza JAP | HA | 523-531 | VYQILATYA (SEQ. ID NO.: 368) | H2-Kd 
Influenza JAP | HA | 529-537 | IYATVAGSL (SEQ. ID NO.: 369) | H2-Kd 
Influenza JAP | HA | 210-219 | TYVSVGTSTI (L>I) (SEQ. ID NO.: 370) | H2-Kd 
Influenza JAP | HA | 255-262 | FESTGNLI (SEQ. ID NO.: 371) | H2-Kk 
JHMV | cAg | 318-326 | APTAGAFFF (SEQ. ID NO.: 372) | H2-Ld 
LCMV | NP | 118-126 | RPQASGVYM (SEQ. ID NO.: 373) | H2-Ld 
LCMV | NP | 396-404 | FQPQNGQFI (SEQ. ID NO.: 374) | H2-Db 
LCMV | GP | 276-286 | SGVENPGGYCL (SEQ. ID NO.: 375) | H2-Db 
LCMV | GP | 33-42 | KAVYNFATCG (SEQ. ID NO.: 376) | H2-Db 
MCMV | pp89 | 168-176 | YPHFMPTNL (SEQ. ID NO.: 377) | H2-Ld 
MHV | spike protein | 510-518 | CLSWNGPHL (SEQ. ID NO.: 378) | H2-Db 




MMTV | env gp36 | 474-482 | SFAVATTAL (SEQ. ID NO.: 379) | H2-Kd 
MMTV | gag p27 | 425-433 | SYETFISRL (SEQ. ID NO.: 380) | H2-Kd 
MMTV | env gp73 | 544-551 | ANYDFICV (SEQ. ID NO.: 381) | H2-Kb 
MuLV | env p15E | 574-581 | KSPWFTTL (SEQ. ID NO.: 382) | H2-Kb 
MuLV | env gp70 | 189-196 | SSWDFITV (SEQ. ID NO.: 383) | H2-Kb 
MuLV | gag 75K | 75-83 | CCLCLTVFL (SEQ. ID NO.: 384) | H2-Db 
MuLV | env gp70 | 423-431 | SPSYVYHQF (SEQ. ID NO.: 385) | H2-Ld 
MV | F protein | 437-447 | SRRYPDAVYLH (SEQ. ID NO.: 386) | HLA-B*2705 
MV | F protein | 438-446 | RRYPDAVYL (SEQ. ID NO.: 387) | HLA-B*2705 
MV | NP | 281-289 | YPALGLHEF (SEQ. ID NO.: 388) | H2-Ld 
MV | HA | 343-351 | DPVIDRLYL (SEQ. ID NO.: 389) | H2-Ld 
MV | HA | 544-552 | SPGRSFSYF (SEQ. ID NO.: 390) | H2-Ld 
Poliovirus | VP1 | 111-118 | TYKDTVQL (SEQ. ID NO.: 391) | H2-Kd 
Poliovirus | VP1 | 208-217 | FYDGFSKVPL (SEQ. ID NO.: 392) | H2-Kd 
Pseudorabies virus gp | GIII | 455-463 | IAGIGILAI (SEQ. ID NO.: 393) | HLA-A*0201 
Rabiesvirus | NS | 197-205 | VEAEIAHQI (SEQ. ID NO.: 394) | H2-Kk 
Rotavirus | VP7 | 33-40 | IIYRFLLI (SEQ. ID NO.: 395) | H2-Kb 
Rotavirus | VP6 | 376-384 | VGPVFPPGM (SEQ. ID NO.: 396) | H2-Kb 
Rotavirus | VP3 | 585-593 | YSGYIFRDL (SEQ. ID NO.: 397) | H2-Kb 
RSV | M2 | 82-90 | SYIGSINNI (SEQ. ID NO.: 398) | H2-Kd 
SIV | gag p11C | 179-190 | EGCTPYDTNQML (SEQ. ID NO.: 399) | Mamu-A*01 
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sv 


NP 


324-332 


FAPGNYPAL 


H2-Db 








(SEQ. K>NO.:400) 




sv 


NP 


324-332 


FAPCTNYPAL 


H2-Kb 








(SEQ.IDNO.:401) 




SV40 


T 


404-411 


WYDFLKC 


H2-Kb 








(SEQ.IDNO.:402) 




SV40 


T 


206-215 


SAINNYAQKL 


H2-Db 








(SEQ. ID NO.:403)_ 




SV40 


T 


223-231 


CKGVNKEYL 


H2-Db 








(SEQ. ID NO.:404) 




SV40 


T 


489-497 


QGINNLDNL 


H2-Db 








(SEQ. ID NO.:405) 




SV40 


T 


492-500 
(501) 


NNLDNLRDY(L) 


H2-Db 








(SEQ. ID NO.:406) 




SV40 


T 


560-568 


SEFLLEKRI 


H2-Kk 








(SEQ. ID NO.:407)




VSV


NP 


52-59 


RGYVYQGL 


H2-Kb 








(SEQ. ID NO.:408) 





Table 5 



HLA-A1 


Position (Antigen) 


Source 


T cell epitopes 


EADPTGHSY 


MAGE-1 161-169 




(SEQ. ID NO.:409) 






VSDGGPNLY 


Influenza A PB1 591-599




(SEQ. ID NO.:410) 






CTELKLSDY 


Influenza A NP 44-52 




(SEQ. ID NO.:411)






EVDPIGHLY 


MAGE-3 168-176 




(SEQ. ID NO.:412)




HLA-A201 


MLLSVPLLLG 


Calreticulin signal sequence 1-10




(SEQ. ID NO.:413)






STBXQSGXQ 


HBV PRE-S PROTEIN 141-149 




(SEQ. ID NO.:414)






YMDGTMSQV 


Tyrosinase 369-377 




(SEQ. ID NO.:415)






ILKEPVHGV 


HIV-1 RT 476-484




(SEQ. ID NO.:416) 






LLGFVFTLTV 


Influenza MP 59-68 




(SEQ. ID NO.:417)






LLFGYPVYV


HTLV-1 tax 11-19 




(SEQ. ID NO.:418)
GLSPTVWLSV


HBV sAg 348-357 




(SEQ. ID NO.:419)






WLSLLVPFV 


HBV sAg 335-343 




(SEQ. ID NO.:420)






FLPSDFFPSV 


HBV cAg 18-27 




(SEQ. ID NO.:421)






CLGGLLTMV


EBV LMP-2 426-434 




(SEQ. ID NO.:422)






FLAGNSAYEYV 


HCMV gp B 618-628




(SEQ. ID NO.:423) 






KLGEFYNQMM 


Influenza B NP 85-94




(SEQ. ID NO.:424) 






KLVALGINAV 


HCV-1 NS3 400-409




(SEQ. ID NO.:425) 






DLMGYIPLV 


HCV MP 17-25 




(SEQ. ID NO.:426)






RLVTLKDIV 


HPV 11 E7 4-12




(SEQ. ID NO.:427)






MLLAVLYCL 


Tyrosinase 1-9 




(SEQ. ID NO.:428)






AAGIGILTV 


Melan A/Mart-1 27-35




(SEQ. ID NO.:429)






YLEPGPVTA 


Pmel 17/gp 100 480-488 




(SEQ. ID NO.:430)






ILDGTATLRL 


Pmel 17/gp 100 457-466 




(SEQ. ID NO.:431)






LLDGTATLRL 


Pmel gp100 457-466




(SEQ. ID NO.:432) 






ITDQVPFSV 


Pmel gp 100 209-217 




(SEQ. ID NO.:433) 






KTWGQYWQV 


Pmel gp 100 154-162 




(SEQ. ID NO.:434)






TITDQVPFSV 


Pmel gp 100 208-217 




(SEQ. ID NO.:435)






AFHITVAREL 


HIV-1 nef 190-198




(SEQ. ID NO.:436)






YLNKIQNSL 


P. falciparum CSP 334-342 




(SEQ. ID NO.:437) 






MMRKLAILSV 


P. falciparum CSP 1-10 




(SEQ. ID NO.:438)






KAGEFYNQMM 


Influenza B NP 85-94




(SEQ. ID NO.:439)







NIAEGLRAL


EBNA-1 480-488 




(SEQ. ID NO.:440) 






NLRRGTALA 


EBNA-1 519-527 




(SEQ. ID NO.:441) 






ALAIPQCRL 


EBNA-1 525-533 




(SEQ. ID NO.:442)






VLKDADKDL 


EBNA-1 575-582 




(SEQ. ID NO.:443) 






FMVFLQTHI 


EBNA-1 562-570 




(SEQ. ID NO.:444) 






HLIVDTDSL 


EBNA-2 15-23 




(SEQ. ID NO.:445) 






SLGNPSLSV 


EBNA-2 22-30 




(SEQ. ID NO.:446) 






PLASAMRML 


EBNA-2 126-134 




(SEQ. ID NO.:447) 






RMLWMANYI 


EBNA-2 132-140 




(SEQ. ID NO.:448) 






MLWMANYIV 


EBNA-2 133-141 




(SEQ. ID NO.:449) 






ILPQGPQTA 


EBNA-2 151-159 




(SEQ. ID NO.:450) 






PLRPTAPTTI 


EBNA-2 171-179 




(SEQ. ID NO.:451)






PLPPATLTV 


EBNA-2 205-213 




(SEQ. ID NO.:452)






RMHLPVLHV


EBNA-2 246-254 




(SEQ. ID NO.:453) 






PMPLPPSQL 


EBNA-2 287-295 




(SEQ. ID NO.:454)






QLPPPAAPA 


EBNA-2 294-302 




(SEQ. ID NO.:455)






SMPELSPVL 


EBNA-2 381-389 




(SEQ. ID NO.:456) 






DLDESWDYI


EBNA-2 453-461 




(SEQ. ID NO.:457) 






PLPCVLWPVV 


BZLF1 43-51 




(SEQ. ID NO.:458)






SLEECDSEL 


BZLF1 167-175 




(SEQ. ID NO.:459)






EIKRYKNRV 


BZLF1 176-184




(SEQ. ID NO.:460) 
QLLQHYREV


BZLF1 195-203 




(SEQ. ID NO.:461)






LLQHYREVA 


BZLF1 196-204




(SEQ. ID NO.:462) 






LLKQMCPSL 


BZLF1 217-225




(SEQ. ID NO.:463) 






SIIPRTPDV 


BZLF1 229-237




(SEQ. ID NO.:464) 






AIMDKNIIL 


Influenza A NS1 122-130




(SEQ. ID NO.:465) 






IMDKNIILKA 


Influenza A NS1 123-132




(SEQ. ID NO.:466)






LLALLSCLTV 


HCV MP 63-72 




(SEQ. ID NO.:467)






ILHTPGCV 


HCV MP 105-112




(SEQ. ID NO.:468)






QLRRHIDLLV 


HCV env E 66-75 




(SEQ. ID NO.:469)






DLCGSVFLV 


HCV env E 88-96 




(SEQ. ID NO.:470)






SMVGNWAKV 


HCV env E 172-180 




(SEQ. ID NO.:471)






HLHQNIVDV 


HCV NS1 308-316




(SEQ. ID NO.:472) 






FLLLADARV 


HCV NS1 340-348




(SEQ. ID NO.:473)






GLRDLAVAVEPVV 


HCV NS2 234-246 




(SEQ. ID NO.:474)






SLLAPGAKQNV 


HCV NS1 18-28




(SEQ. ID NO.:475)






LLAPGAKQNV 


HCV NS1 19-28




(SEQ. ID NO.:476)






FLLSLGIHL 


HBV pol 575-583 




(SEQ. ID NO.:477) 






SLYADSPSV 


HBV pol 816-824 




(SEQ. ID NO.:478)






GLSRYVARL 


HBV POL 455-463 




(SEQ. ID NO.:479) 






KIFGSLAFL 


HER-2 369-377 




(SEQ. ID NO.:480) 






ELVSEFSRM 


HER-2 971-979 




(SEQ. ID NO.:481)
KLTPLCVTL


HIV-1 gp160 120-128




(SEQ. ID NO.:482)






SLLNATDIAV 


HIV-1 GP160 814-823




(SEQ. ID NO.:483) 






VLYRYGSFSV 


Pmel gp100 476-485




(SEQ. ID NO.:484) 






YIGEVLVSV 


Non-filament forming class I myosin 
family (HA-2)** 




(SEQ. ID NO.:485) 






LLFNILGGWV 


HCV NS4 192-201




(SEQ. ID NO.:486) 






LLVPFVQWFW 


HBV env 338-347 




(SEQ. ID NO.:487)






ALMPLYACI 


HBV pol 642-650 




(SEQ. ID NO.:488)






YLVAYQATV 


HCV NS3 579-587 




(SEQ. ID NO.:489)






TLGIVCPIC 


HPV 16 E7 86-94




(SEQ. ID NO.:490)






YLLPRRGPRL 


HCV core protein 34-43 




(SEQ. ID NO.:491)






LLPIFFCLWV 


HBV env 378-387 




(SEQ. ID NO.:492) 






YMDDWLGA 


HBV Pol 538-546 




(SEQ. ID NO.:493)






GTLGIVCPI 


HPV 16 E7 85-93




(SEQ. ID NO.:494) 






LLALLSCLTI 


HCV MP 63-72 




(SEQ. ID NO.:495) 






MLDLQPETT 


HPV 16 E7 12-20 




(SEQ. ID NO.:496)






SLMAFTAAV 


HCV NS4 174-182




(SEQ. ID NO.:497) 






CINGVCWTV 


HCV NS3 67-75 




(SEQ. ID NO.:498) 






VMNILLQYVV 


Glutamic acid decarboxylase 114-123




(SEQ. ID NO.:499) 






ILTVILGVL 


Melan A/Mart-1 32-40




(SEQ. ID NO.:500) 






FLWGPRALV 


MAGE-3 271-279 




(SEQ. ID NO.:501) 






LLCPAGHAV


HCV NS3 163-171
(SEQ. ID NO.:502)






ILDSFDPLV 


HCV NS5 239-247




(SEQ. ID NO.:503) 






LLLCLIFLL 


HBV env 250-258 




(SEQ. ID NO.:504)






LIDYQGMLPV 


HBV env 260-269 




(SEQ. ID NO.:505)






SIVSPFIPLL 


HBV env 370-379 




(SEQ. ID NO.:506) 






FLLTRILTI 


HBV env 183-191 




(SEQ. ID NO.:507) 






HLGNVKYLV 


P. falciparum TRAP 3-11




(SEQ. ID NO.:508)






GIAGGLALL 


P. falciparum TRAP 500-508




(SEQ. ID NO.:509)






ILAGYGAGV 


HCV NS4A 236-244




(SEQ. ID NO.:510)






GLQDCTMLV 


HCV NS5 714-722 




(SEQ. ID NO.:511)






TGAPVTYSTY 


HCV NS3 281-290




(SEQ. ID NO.:512)






VIYQYMDDLV 


HIV-1 RT 179-187




(SEQ. ID NO.:513)






VLPDVFIRCV 


N-acetylglucosaminyltransferase V Gnt-V 
intron 




(SEQ. ID NO.:514)






VLPDVFIRC 


N-acetylglucosaminyltransferase V Gnt-V 
intron 




(SEQ. ID NO.:515)






AVGIGIAVV 


Human CD9 




(SEQ. ID NO.:516)






LWLGLLAV 


Human glutamyltransferase 




(SEQ. ID NO.:517)






ALGLGLLPV 


Human G protein coupled receptor 164-172




(SEQ. ID NO.:518)






GIGIGVLAA 


HSV-1 gp C 480-488




(SEQ. ID NO.:519)






GAGIGVAVL 


HSV-2 gp C 446-454 




(SEQ. ID NO.:520) 






IAGIGELAI 


Pseudorabies gp GIII 455-463




(SEQ. ID NO.:521)






LIVIGILIL 


Adenovirus 3 E3 9kD 30-38 
(SEQ. ID NO.:522)






LAGIGLIAA


S. lincolnensis lmrA




(SEQ. ID NO.:523)






VDGIGELTI 


Yeast ysa-1 77-85 




(SEQ. ID NO.:524) 






GAGIGVLTA 


B. polymyxa, β-endoxylanase 149-157




(SEQ. ID NO.:525)






AAGIGIIQI


E. coli methionine synthase 590-598 




(SEQ. ID NO.:526) 






QAGIGILLA 


E. coli hypothetical protein 4-12 




(SEQ. ID NO.:527)






KARDPHSGHFV 


CDK4wt 22-32




(SEQ. ID NO.:528)






KACDPHSGHFV


CDK4-R24C 22-32 




(SEQ. ID NO.:529)






ACDPHSGHFV


CDK4-R24C 23-32 




(SEQ. ID NO.:530)






SLYNTVATL 


HIV-1 gag p17 77-85




(SEQ. ID NO.:531)






ELVSEFSRV 


HER-2, m>V substituted 971-979 




(SEQ. ID NO.:532) 






RGPGRAFVTI 


HIV-1 gp160 315-329




(SEQ. ID NO.:533)






HMWNFISGI 


HCV NS4A 149-157 




(SEQ. ID NO.:534)






NLVPMVATVQ 


HCMV pp65 495-504 




(SEQ. ID NO.:535)






GLHCYEQLV 


HPV 6b E7 21-30 




(SEQ. ID NO.:536)






PLKQHFQEV 


HPV 6b E7 47-55




(SEQ. ID NO.:537)






LLDFVRFMGV 


EBNA-6 284-293 




(SEQ. ID NO.:538)






AEMEKNEML 


Influenza Alaska NS1 122-130




(SEQ. ID NO.:539)






YLKTIQNSL 


P. falciparum cp36 CSP 




(SEQ. ID NO.:540)






YLNKIQNSL 


P. falciparum cp39 CSP 




(SEQ. ID NO.:541)






YMLDLQPETT 


HPV 16 E7 11-20* 




(SEQ. ID NO.:542)







LLMGTLGIV 


HPV16 E7 82-90** 




(SEQ. ID NO.:543)






TLGIVCPI 


HPV 16 E7 86-93 




(SEQ. ID NO.:544) 






TLTSCNTSV 


HIV-1 gp120 197-205




(SEQ. ID NO.:545) 






KLPQLCTEL 


HPV 16 E6 18-26 




(SEQ. ID NO.:546) 






TIHDHLEC 


HPV 16 E6 29-37




(SEQ. ID NO.:547)






LGIVCPICS 


HPV 16 E7 87-95 




(SEQ. ID NO.:548)






VILGVLLLI 


Melan A/Mart-1 35-43 




(SEQ. ID NO.:549)






ALMDKSLHV 


Melan A/Mart-1 56-64




(SEQ. ID NO.:550) 






GILTVILGV 


Melan A/Mart-1 31-39




(SEQ. ID NO.:551)




T cell epitopes 


MINAYLDKL 


P. falciparum STARP 523-531




(SEQ. ID NO.:552)






AAGIGILTV 


Melan A/Mart-1 27-35




(SEQ. ID NO.:553)






FLPSDFFPSV 


HBV cAg 18-27 




(SEQ. ID NO.:554)




Motif unknown 


SVRDRLARL 


EBNA-3 464-472 


T cell epitopes 


(SEQ. ID NO.:555)




T cell epitopes 


AAGIGILTV 


Melan A/Mart-1 27-35 




(SEQ. ID NO.:556)






FAYDGKDYI 


Human MHC I-α 140-148




(SEQ. ID NO.:557)




T cell epitopes 


AAGIGILTV 


Melan A/Mart-1 27-35 




(SEQ. ID NO.:558) 






FLPSDFFPSV 


HBV cAg 18-27 




(SEQ. ID NO.:559)




Motif unknown 


AAGIGILTV 


Melan A/Mart-1 27-35


T cell epitopes 


(SEQ. ID NO.:560)






FLPSDFFPSV 


HBV cAg 18-27 




(SEQ. ID NO.:561)






AAGIGILTV 


Melan A/Mart-1 27-35 




(SEQ. ID NO.:562) 






ALLAVGATK 


Pmel17 gp100 17-25




(SEQ. ID NO.:563)





T cell epitopes 


RLRDLLLIVTR 


HIV-1 gp41 768-778 




(SEQ. ID NO.:564)






QVPLRPMTYK 


HIV-1 nef 73-82 




(SEQ. ID NO.:565)






TVYYGVPVWK 


HIV-1 gp120 36-45




(SEQ. ID NO.:566)






RLRPGGKKK 


HIV-1 gag p17 20-29




(SEQ. ID NO.:567)






ELRGSVAHK 


Influenza NP 265-273 




(SEQ. ID NO.:568)






RLRAEAGVK 


EBNA-3 603-611 




(SEQ. ID NO.:569) 






RLRDLLLIVTR 


HIV-1 gp41 770-780




(SEQ. ID NO.:570) 






VYYGVPVWK 


HIV-1 GP120 38-46




(SEQ. ID NO.:571)






RVCEKMALY 


HCV NS5 575-583




(SEQ. ID NO.:572)




Motif unknown 


KIFSEVTLK 


Unknown; mutated (p183L) melanoma peptide 175-183


T cell epitope 


(SEQ. ID NO.:573) 






YVNVNMGLK* 


HBV cAg 88-96 




(SEQ. ID NO.:574) 




T cell epitopes 


IVTDFSVIK


EBNA-4 416-424 




(SEQ. ID NO.:575) 






ELNEALELK 


P53 343-351 




(SEQ. ID NO.:576)






VPLRPMTYK 


HIV-1 NEF 74-82




(SEQ. ID NO.:577)






ADFQSSMTK 


HIV-1 gag p24 325-333




(SEQ. ID NO.:578)






QVPLRPMTYK 


HIV-1 nef 73-82




(SEQ. ID NO.:579)






TENYTEFK


HCV NS1 238-246




(SEQ. ID NO.:580) 






AAVDLSHFLKEK 


HIV-1 nef 83-94 




(SEQ. ID NO.:581)






ACQGVGGPGGHK


HIV-1 IIIB p24 349-359




(SEQ. ID NO.:582)




HLA-A24 


SYLDSGIHF*


β-catenin, mutated (proto-oncogene)
29-37 




(SEQ. ID NO.:583)




T cell epitopes 


RYLKDQQLL 


HIV GP 41 583-591 
(SEQ. ID NO.:584)






AYGLDFYIL 


P15 melanoma Ag 10-18




(SEQ. ID NO.:585) 






AFLPWHRLFL 


Tyrosinase 206-215 




(SEQ. ID NO.:586) 






AFLPWHRLF 


Tyrosinase 206-214 




(SEQ. ID NO.:587) 






RYSIFFDY 


Ebna-3 246-253 




(SEQ. ID NO.:588) 




T cell epitope 


ETENEEAAEW 


HIV-1 gag p24 203-212




(SEQ. ID NO.:589) 




T cell epitopes 


STLPETTWRR 


HBV cAg 141-151




(SEQ. ID NO.:590) 






MSLQRQFLR 


ORF 3P-gp75 294-321 (bp) 




(SEQ. ID NO.:591) 






LLPGGRPYR 


TRP (tyrosinase rel.) 197-205 




(SEQ. ID NO.:592) 




T cell epitope 


IVGLNKIVR


HIV gag p24 267-275




(SEQ. ID NO.:593) 






AAGIGILTV 


Melan A/Mart-1 27-35




(SEQ. ID NO.:594)





[0124] Table 6 sets forth additional antigens useful in the invention that are 
available from the Ludwig Cancer Institute. The Table identifies the patents in which the 
listed antigens can be found; those patents are incorporated herein by reference. TRA refers 
to the tumor-related antigen, and LUD No. refers to the Ludwig Institute number. 



Table 6 



TRA 


LUD 
No. 


Patent No. 


Date Patent Issued 


Peptide (Antigen) 


HLA 


MAGE-4 


5293 


5,405,940 


11 April 1995 


EVDPASNTY 


HLA-A1 










(SEQ. ID NO.:979) 




MAGE-41 


5293 


5,405,940 


11 April 1995 


EVDPTSNTY 


HLA-A1










(SEQ ID NO:595) 




MAGE-5 


5293 


5,405,940 


11 April 1995 


EADPTSNTY 


HLA-A1










(SEQ ID NO:596) 




MAGE-51 


5293 


5,405,940 


11 April 1995 


EADPTSNTY 


HLA-A1










(SEQ ID NO:597) 
MAGE-6


5294 


5,405,940 


11 April 1995 


EVDPIGHVY 


HLA-A1 










(SEQ ID NO:598) 






5299.2 


5,487,974 


30 January 1996 


MLLAVLYCLL 


HLA-A2 










(SEQ ID NO:599) 






5360 


5,530,096 


25 June 1996 


MLLAVLYCL 


HLA-B44 










(SEQ ID NO:600) 




Tyrosinase 


5360.1 


5,519,117 


21 May 1996 


SEIWRDIDFA 


HLA-B44 










(SEQ ID NO:601) 












SEIWRDIDF 












(SEQ ID NO:602) 




Tyrosinase 


5431 


5,774,316 


28 April 1998 


XEIWRDIDF 


HLA-B44 










(SEQ ID NO:603) 




MAGE-2 


5340 


5,554,724 


10 September 1996 


STLVEVTLGEV 


HLA-A2 










(SEQ ID NO:604) 












LVEVTLGEV 












(SEQ ID NO:605) 












VIFSKASEYL 












(SEQ ID NO:606) 












IIVLAIIAI












(SEQ ID NO:607) 












KIWEELSMLEV 












(SEQ ID NO:608) 












LIETSYVKV 












(SEQ ID NO:609) 






5327 


5,585,461 


17 December 1996 


FLWGPRALV 


HLA-A2 










(SEQ ID NO:610)












TLVEVTLGEV 












(SEQ ID NO:611)












ALVETSYVKV 












(SEQ ID NO:612) 




MAGE-3 


5344 


5,554,506 


10 September 1996 


KIWEELSVL 


HLA-A2 










(SEQ ID NO:613)




MAGE-3 


5393 


5,405,940 


11 April 1995 


EVDPIGHLY 


HLA-A1 










(SEQ ID NO:614)




MAGE 


5293 


5,405,940 


11 April 1995 


EXDX5Y 


HLA-A1 










(SEQ. ID NO.:615) 












(but not EADPTGHSY) 
(SEQ. ID NO.:616)












E (A/V) D X5 Y 












(SEQ. ID NO.:617) 












E (A/V) D P X4 Y












(SEQ. ID NO.:618) 












E (A/V) D P (I/A/T) X3 Y 












(SEQ. ID NO.:619)












E (A/V) D P (I/A/T) (G/S) X2 Y












(SEQ. ID NO.:620)












E (A/V) D P (I/A/T) (G/S) (H/N) X Y












(SEQ. ID NO.:621)












E (A/V) DP (I/A/T) (G/S) (H/N) 
(L/T/V) Y 












(SEQ. ID NO.:622)




MAGE-1 


5361 


5,558,995


24 September 1996 


ELHSAYGEPRKLLTQD 


HLA-C 










(SEQ ID NO:623) 


Clone 10 










EHSAYGEPRKLL 












(SEQ ID NO:624)












SAYGEPRKL 












(SEQ ID NO:625) 




MAGE-1 


5253.4 


TBA 


TBA 


EADPTGHSY 


HLA-A1










(SEQ ID NO:626) 




BAGE 


5310.1 


TBA 


TBA 


MAARAVFLALSAQLLQARLMKE 


HLA-C 










(SEQ ID NO:627) 


Clone 10 










MAARAVFLALSAQLLQ 


HLA-C 










(SEQ ID NO:628) 


Clone 10 










AARAVFLAL 


HLA-C 










(SEQ ID NO:629) 


Clone 10 


GAGE 


5323.2 


5,648,226 


15 July 1997 


YRPRPRRY 


HLA-CW6 










(SEQ. ID NO.:630) 
Table 7



Source 


Protein 


AA 
Position 


MHC 
molecules 


T cell epitope 
MHC ligand 
(Antigen) 


SEQ. 
ID 
NO.: 


Ref. 


synthetic 
peptides 


synthetic 
peptides 


synthetic 
peptides 


HLA-A2 


ALFAAAAAV 


631 


Parker, et al., "Scheme 
for ranking potential 
HLA-A2 binding 
peptides based on 
independent binding of 
individual peptide side-
chains," J. Immunol.
152:163-175








CC 


Lrlr LjLjVLiOV 


OjZ 










CC 


OLLJKLrLrLr V 


A^l 










CC 


vjLr LrLrr LrLr V 


0j4 










CC 


{jLr LrLrvjALr V 


Am 


u 








CC 


KjLr LrvjLriiLr V 


A1A 
OjO 


u 








CC 


KjLr LrLrLrr Lr V 


A17 


4< 








CC 


KjLr LrvjLrLrvjr.L 


Ojo 


M 








CC 


KjL,r vjOOOvj V 




u 








CC 


LrLr LrLrLr Vuv 


04U 










CC 


\jLr vjkj V Kjkj V 












CC 


{jLr ULr V LrJv V 


A/10 
04Z 










CC 


KjLr KLr V LrLr V 












CC 


GLGGGGr G V 


A/1/1 
o44 










CC 


GLLGGG V Lr V 


A/i c 
o4j 










CC 


KjL Y GGGUG V 


A/1 A 
O4o 










CC 


uJVLr GGGGG V 


A/n 
04/ 










CC 


GJVLr LrLr V GG V 


A/1 0 

o4o 










CC 


GCjtGGVGGV 


A/1Q 

04y 










CC 


GVrGGVGGV 












cc 


KLFGGGGGV 


651 










CC 


JvLr uu V Ovj V 


A<,9 
OjZ 










CC 


AILGFVFTL 


653 










CC 


GAIGFVFTL 


654 










CC 


GALGFVFTL 


655 










CC 


GELGFVFTL 


656 










CC 


GIAGFVFTL 


657 










CC 


GIEGFVFTL 


658 










CC 


GELAFVFTL 


659 
GELGAVFTL


660 












GELGEVFTL 


661 












GILFGAFTL 


662 












GILGFEFTL 


663 












GILGFKFTL 


664 












GILGFVATL 


665 












GELGFVETL 


666 












GILGFVFAL 


667 












GILGFVFEL 


668 













GILGFVFKL 


669 












GELGFVFTA 


670 












GILGFVFTL 


671 












GILGFVFVL 


672 












GILGFVKTL 


673 












GELGKVFTL 


674 












GILKFVFTL 


675 












GILPFVFTL 


676 












GIVGFVFTL 


677 












GKLGFVFTL 


678 












GLLGFVFTL 


679 












GQLGFVFTL 


680 












KALGFVFTL 


681 












KILGFVFTL 


682 












KELGKVFTL 


683 












AILLGVFML 


684 












AIYKRWIIL 


685 












ALFFFDDDL 


686 












ATVELLSEL 


687 












CLFGYPVYV 


688 












FEFPNYTIV 


689 












HSLWDSQL 


690 












ILASLFAAV 


691 












ILESLFAAV 


692 












KLGEFFNQM 


693 












KLGEFYNQM 


694 












LLFGYPVYV 


695 












LLWKGEGAV 


696 












LMFGYPVYV


697 












LNFGYPVYV 


698 












LQFGYPVYV 


699 












NIVAHTFKV 


700 












NLPMVATV 


701 













QMLLAIARL 


702 












QMWQARLTV 


703 












RLLQTGIHV 


704 












RLVNGSLAL 


705 












SLYNTVATL 


706 












TLNAWVKVV 


707 












WLYRETCNL 


708 












YLFKRMEDL 


709 












GAFGGVGGV 


710 












GAFGGVGGY 


711 












GEFGGVGGV 


712 












GGFGGVGGV 


713 












GEFGGGGGV 


714 












GIGGFGGGL 


715 












GIGGGGGGL 


716 












GLDGGGGGV 


717 












GLDGKGGGV 


718 












GLDKKGGGV 


719 












GLFGGGFGF 


720 












GLFGGGFGG 


721 












GLFGGGFGN 


722 












GLFGGGFGS 


723 












GLFGGGGGI 


724 












GLFGGGGGM 


725 












GLFGGGGGT 


726 












GLFGGGGGY 


727 












GLGFGGGGV 


728 












GLGGFGGGV 


729 












GLGGGFGGV 


730 












GLGGGGGFV 


731 












GLGGGGGGY 


732 












GLGGGVGGV 


733 












GLLGGGGGV 


734 












GLPGGGGGV 


735 












GNFGGVGGV 


736 












GSFGGVGGV 


737 












GTFGGVGGV 


738 












AGNSAYEYV 


739 












GLFPGQFAY 


740 












HILLGVFML 


741 












ILESLFRAV 


742 












KKKYKLKHI 


743 













MLASIDLKY 


744 












MLERELVRK 


745 












KLFGFVFTV 


746 












ILDKKVEKV 


747 










CC 


ILKEPVHGV 


748 










cc 


ALFAAAAAY 


749 










cc 


GIGFGGGGL 


750 












GKFGGVGGV 


751 












GLFGGGGGK 


752 










cc 


EELGFVFTL 


753 












GDCGFVFTL 


754 










cc 


GQLGFVFTK 


755 










CC 


ILGFVFTLT 


756 










cc 


KILGFVFTK 


757 










cc 


KKLGFVFTL 


758 










cc 


KLFEKVYNY 


759 










cc 


LRFGYPVYV 


760 




Human 


HSP60 


140-148 


HLA-B27 


IRRGVMLAV


761 


Rammensee et al. 1997 
160 




cc 


369-377 




KRIQEHEQ 


762 




cc 


cc 


469-477 


cc 


KRTLKIPAM 


763 


cc 


Yersinia 


HSP60 


35-43 


cc 


GRNWLDKS 


764 


cc 


«< 


(( 


117-125 


cc 


KRGIDKAVI 


765 


cc 


CC 


CC 


420-428 


cc 


IRAASAITA 


766 


cc 


CC 


HSP 60 


284-292 


HLA- 
B*2705 


RRKAMFEDI 


767 


169 


P.

falciparum 


LSA-1 


1850- 
1857 


HLA- 
B3501 


KPKDELDY 


768 


170 


Influenza 
NP 




379-387 


HLA- 
B*4402 


LELRSRYWA 


769 


183 




Tum-P35B 


4-13 


HLA-D" 


GPPHSNNFGY 


770 


230 


Rotavirus 


VP7 


33-40 




IIYRFLLI


771 


262 




OGDH 
(F108Y) 


104-112 


H2-Ld


QLSPYPFDL 


772 


253 




TRP-2 


181-188 


p287 


VYDFFVWL 


773 


284 




DEAD box 
p68 


547-554 


p287 


SNFVFAGI 


774 


283 




Vector 
"artefact" 




p287 


SWEFSSL 


775 


260 




Epitope 
mimic of 
tumor Ag 




p287 


AHYLFRNL 


776 


278 


















« 


THYLFRNL 


777 


cc 




Epitope 
mimic of 
H-3 

miHAg" 




cc 


LIViYNTL 


778 


279 










LIYEFNTL 


779 


a 








u 


IPYIYNTL 


780 


cc 








cc 


HYTYHPvL 


781 










cc 


LIYIFNTL 


782 


" 




HBV cAg 


93-100 


cc 


MGLKFRQL 


783 


280 


Human 


autoantigen 
LA 


51-58 


cc 


IMIKFRNRL 


784 


281 


Mouse 


UTY 
protein 




H2-Db


WMHHNMDLI 


785 


303 


Mouse 


p53 


232-240 


cc 


KYMCNSSCM 


786 


302 


MURINE 


MDM2 


441-449 


cc 


GRPKNGCIV 


787 


277 




Epitope 
mimic of 
natural 




cc 


AQHPNAELL 


788 


278 




MuLV 
gag75K 


75-83 


cc 


CCLCLTVFL 


789 


301 


P. 

falciparum


CSP 


375-383 


p290 


YENDIEKK 


790 


315 






371-379 




DELDYENDI 


791 


315


HIV-1 


RT 


206-214 


cc 


TEMEKEGKI 


792 


316 


Rabies 


NS 


197-205 




VEAEIAHQI 


793 


309, 310 


Influenza A 


NS1 


152-160 


(C 


EEGATVGEI 


794 


304 


Murine 


SMCY 




p291 


TENSGKDI 


795 


317 




MHC class 
I leader 


3-11 


p293 


AMAPRTLLL 


796 


318 




ND1 alpha 


1-12 


p293 


FFINILTLLVP 


797 


323 




NDBeta 


1-12 


p293 


FFINILTLLVP 


798 


323 




ND alpha 


1-17 


cc 


FFINILTLLVPI 
LIAM 


799 


324 




NDBeta 


1-17 


cc 


FFINALTLLVPI 
LIAM 


800 


cc 




COI 
mitochondrial 


1-6 


cc 


FINRW 


801 


325 


L. 

monocytogenes 


LemA 


1-6 


cc 


IGWH 


802 


326 
















SIV gag 
pllC 


179-190 


Mamu- 
A*01 


EGCTPYDINQ 
ML 


803 


334 


















MAGE-3




HLA-A2 


ALSRKVAEL 


804 


5,554,506 








cc 


IMPKAGLLI 


805 


cc 








cc 


KIWEELSVL 


806 


cc 








cc 


ALVETSYVKV 


807 


cc 








cc 


ThrLeuValGluV 
alThrLeuGlyGlu 
Val 


808 


cc 








cc 


AlaLeuSerArgLy 
sValAlaGluLeu 


809 


cc 








cc 


IleMetProLysAl 
aGlyLeuLeuIle 


810 


cc 








cc 


LysIleTrpGluGl 
uLeuSerValLeu 


811 


cc 








cc 


AlaLeuValGluT 
hrSerTyrValLys 
Val 


812 


cc 


















peptides 
which bind 
to MHCs




HLA-A2 


Lys Gly Ile Leu 
Gly Phe Val Phe 
Thr Leu Thr Val 


813 


5,989,565 








cc 


Gly Ile Ile Gly 
Phe Val Phe Thr 
Ile 


814 


cc 








cc 


Gly Ile Ile Gly 
Phe Val Phe Thr 
Leu 


815 


cc 








cc 


Gly Ile Leu Gly 
Phe Val Phe Thr 
Leu 


816 


cc 








cc 


Gly Leu Leu Gly 
Phe Val Phe Thr 
Leu 


817 


cc 








cc 


XXTVXXGVX, 
X = Leu or Ile (6- 
37) 


818 


cc 








cc 


Ile Leu Thr Val 
Ile Leu Gly Val 
Leu 


819 


cc 
cc


Tyr Leu Glu Pro 
Gly Pro Val Thr 
Ala 


820 


cc 








cc 


Gln Val Pro Leu 
Arg Pro Met Thr 
Tyr Lys 


821 


cc 








cc 


Asp Gly Leu Ala 
Pro Pro Gln His 
Leu Ile Arg 


822 


cc 








cc 


Leu Leu Gly Arg 
Asn Ser Phe Glu 
Val 


823 


cc 


















Peptides 

from 

MAGE-1 




HLA-C 
clone 10 


GluHisSerAlaTy 
rGlyGluProArgL 
ysLeuLeuThrGln 
AspLeu 


824 


5,558,995 








cc 


GluHisSerAlaTy 
rGlyGluProArgL 
ysLeuLeu 


825 


cc 








cc 


SerAlaTyrGlyGl 
uProArgLysLeu 


826 


cc 


















GAGE 




HLA-Cw6 


TyrArgProArgPr 
oArgArgTyr 


827 


5,648,226 








cc 


ThrTyrArgProAr 
gProArgArgTyr 


828 


cc 








cc 


TyrArgProArgPr 
oArgArgTyrVal 


829 


cc 








cc 


ThrTyrArgProAr 

gProArgArgTyr 

Val 


830 


cc 








cc 


ArgProArgProAr 
gArgTyrValGlu 


831 


cc 








cc 


MetSerTrpArgG 
lyArgSerThrTyr 
ArgProArgProAr 
gArg 


832 


cc 








cc 


ThrTyrArgProAr 
gProArgArgTyr 
ValGluProProGl 
uMetIle


833 


cc 




MAGE 




HLA-A1, 


Isolated 


834 


5,405,940 
primarily


nonapeptide 
having Glu at its 
N terminal, Tyr 
at its C-terminal, 
and Asp at the 
third residue 
from its N 
terminal, with 
the proviso that 
said isolated 
nonapeptide is 
not Glu Ala Asp 
Pro Thr Gly His 
Ser Tyr (SEQ ID 
NO: 1), and 
wherein said 
isolated 
nonapeptide 
binds to a human 
leukocyte 
antigen molecule 
on a cell to form 
a complex, said 
complex 

provoking lysis 
of said cell by a 
cytolytic T cell 
specific to said 
complex 














GluValValProIle 
SerHisLeuTyr 


835 












GluValValArgIl
eGlyHisLeuTyr 


836 












GluValAspProIl 
eGlyHisLeuTyr 


837 












GluValAspProA 
laSerAsnThrTyr 


838 












GluValAspProT 
hrSerAsnThrTyr 


839 












GluAlaAspProT 
hrSerAsnThrTyr 


840 












GluValAspProIl 
eGlyHisValTyr 


841 














GAAGTGGTCC 
CCATCAGCCA 
CTTGTAC 


842 


cc 










GAAGTGGTCC 
GCATCGGCCA 
CTTGTAC 


843 


cc 










GAAGTGGAC 

CCCATCGGCC 

ACTTGTAC 


844 


cc 










GAAGTGGAC 

CCCGCCAGCA 

ACACCTAC 


845 


« i 








< 6 


GAAGTGGAC 

CCCACCAGCA 

ACACCTAC 


846 


cc 








< t 


GAAGCGGAC 

CCCACCAGCA 

ACACCTAC 


847 


cc 










GAAGCGGAC 

CCCACCAGCA 

ACACCTAC 


848 


cc 










GAAGTGGAC 

CCCATCGGCC 

ACGTGTAC 


849 


cc 








» 


GluAlaAspProT 
hrGlyHisSer 


850 


cc 










AlaAspProTrpGl 
yHisSerTyr 


851 


cc 




MAGE 
peptides 




HLA-A2 


SerThrLeuValGl 

uValThrLeuGly 

GluVal 


852 


5,554,724 




cc 




a 


LeuValGluValT 
hrLeuGlyGluVal 


853 


cc 




cc 




cc 


LysMetValGluL 
euValHisPheLeu 


854 


cc 




cc 




cc 


ValIlePheSerLys
AlaSerGluTyrLe 
u 


855 


cc 




cc 




cc 


TyrLeuGlnLeuV 

alPheGlyIleGlu

Val 


856 


cc 




cc 




cc 


GlnLeuValPheG 
lyIleGluValVal 


857 


cc 








C( 






GlnLeuValPheG 
lyIleGluValValG
luVal 


858 






CC 




cc 


IleIleValLeuAlaI 
leIleAlaIle 


859 


cc 




cc 




«£ 


LysIleTrpGluGl
uLeuSerMetLeu 
GluVal 


860 


cc 




CC 




cc 


AlaLeuIleGluTh 
rSerTyrValLysV 
al 


861 


cc 




CC 




cc 


LeuIleGluThrSer 
TyrValLysVal 


862 


cc 




cc 




t< 


GlyLeuGluAlaA 
rgGlyGluAlaLeu 
GlyLeu 


863 


cc 




CC 




cc 


GlyLeuGluAlaA 
rgGlyGluAlaLeu 


864 


cc 




cc 




cc 


AlaLeuGlyLeuV 
alGlyAlaGlnAla 


865 


cc 




cc 




cc 


GlyLeuValGlyAl 
aGlnAlaProAla 


866 


cc 




cc 




cc 


AspLeuGluSerG 
luPheGlnAlaAla 


867 


cc 




(C 




cc 


AspLeuGluSerG 
luPheGlnAlaAla 
Ile


868 


cc 




(C 




cc 


AlaIleSerArgLys
MetValGluLeuV 
al 


869 


cc 




cc 




cc 


AlaIleSerArgLys
MetValGluLeu 


870 


cc 




cc 




cc 


LysMetValGluL 
euValHisPheLeu 
Leu 


871 


cc 




cc 




cc 


LysMetValGluL 
euValHisPheLeu 
LeuLeu 


872 


cc 




cc 




cc 


LeuLeuLeuLysT 
yrArgAlaArgGlu 
ProVal 


873 


cc 




cc 




cc 


LeuLeuLysTyrA 
rgAlaArgGluPro 
Val 


874 


cc 














ValLeuArgAsnC 
ysGlnAspPhePh 
eProVal 


875 


cc 










TyrLeuGlnLeuV 

alPheGlyIleGlu

ValVal 


876 


« 










GlyIleGluValVal
GluValValProIle 


877 


CC 










ProIleSerHisLeu 
TyrIleLeuVal


878 


cc 










HisLeuTyrIleLeu
ValThrCysLeu 


879 


cc 










HisLeuTyrIleLeu
ValThrCysLeuG 
lyLeu 


880 


cc 










TyrIleLeuValThr
CysLeuGlyLeu 


881 


cc 










CysLeuGlyLeuS 
erTyrAspGlyLeu 


882 


cc 










CysLeuGlyLeuS 
erTyrAspGlyLeu 
Leu 


883 


cc 










ValMetProLysT 
hrGlyLeuLeuIle 


884 


cc 










ValMetProLysT 
hrGlyLeuLeuIleI 
le 


885 


cc 










ValMetProLysT 
hrGlyLeuLeuIleI 
leVal 


886 


cc 










GlyLeuLeuIleIle 
ValLeuAlaIle 


887 


cc 










GlyLeuLeuIleIle 
ValLeuAlaIleIle 


888 


cc 










GlyLeuLeuIleIle 
ValLeuAlaIleIle 
Ala 


889 


cc 










LeuLeuIleIleVal 
LeuAlaIleIle 


890 


cc 










LeuLeuIleIleVal 
LeuAlaIleIleAla 


891 


cc 
cc




cc 


LeuLeuIleIleVal 
LeuAlaIleIleAlaI 
le 


892 






cc 




cc 


LeuIleIleValLeu 
AlaIleIleAla 


893 


« 




cc 




cc 


LeuIleIleValLeu 
AlaIleIleAlaIle 


894 


« 




cc 




cc 


IleIleAlaIleGluG
lyAspCysAla 


895 


» 




cc 




cc 


LysIleTrpGluGl 
uLeuSerMetLeu 


896 


cc 








cc 


LeuMetGlnAspL 

euValGlnGluAs 

nTyrLeu 


897 


cc 




cc 




cc 


PheLeuTrpGlyPr 
oArgAlaLeuIle 


898 


cc 




cc 




cc 


LeuIleGluThrSer 
TyrValLysVal


899 


cc 




cc 




cc 


AlaLeuIleGluTh 
rSerTyrValLysV 
alLeu 


900 


cc 




cc 




cc 


ThrLeuLysIleGl 

yGlyGluProHisIl 

e 


901 


cc 




cc 




cc 


HisIleSerTyrPro 

ProLeuHisGluAr 

gAla 


902 


cc 




cc 




cc 


GlnThrAlaSerSe 
rSerSerThrLeu 


903 


cc 




cc 




cc 


GlnThrAlaSerSe 
rSerSerThrLeuV 
al 


904 


cc 




cc 




cc 


ValThrLeuGlyGl 
uValProAlaAla 


905 


cc 




cc 




cc 


ValThrLysAlaGl 

uMetLeuGluSer 

Val 


906 


cc 




cc 




cc 


ValThrLysAlaGl 

uMetLeuGluSer 

ValLeu 


907 


cc 




cc 




cc 


ValThrCysLeuG 
lyLeuSerTyrAsp 
GlyLeu 


908 


cc 





cc 




cc 


LysThrGlyLeuL 
euIleIleValLeu


909 


cc 




cc 




cc 


LysThrGlyLeuL 
euIleIleValLeuA
la 


910 


cc 




C( 




cc 


LysThrGlyLeuL 
euIleIleValLeuA 
laIle 


911 


cc 




CC 




cc 


HisThrLeuLysIle 
GlyGlyGluProHi 
sIle


912 


cc 




CC 




cc 


MetLeuAspLeu

GlnProGluThrT 

hr 


913 


cc 




Mage-3 
peptides 




HLA-A2 


GlyLeuGluAlaA 
rgGlyGluAlaLeu 


914 


5,585,461 








cc 


AlaLeuSerArgLy 
sValAlaGluLeu 


915 


cc 








cc 


PheLeuTrpGlyPr 
oArgAlaLeuVal 


916 


cc 




cc 




cc 


ThrLeuValGluV 
alThrLeuGlyGlu 
Val 


917 


cc 




cc 




cc 


AlaLeuSerArgLy 

sValAlaGluLeu 

Val 


918 


cc 




cc 




cc 


AlaLeuValGluT 
hrSerTyrValLys 
Val 


919 


cc 




Tyrosinase 




HLA-A2 


TyrMetAsnGlyT 
hrMetSerGlnVal 


920 


5,487,974 




cc 




cc 


MetLeuLeuAlaV 
alLeuTyrCysLeu 
Leu 


921 


cc 


















Tyrosinase 




HLA-A2 


MetLeuLeuAlaV 
alLeuTyrCysLeu 


922 


5,530,096 




cc 




cc 


LeuLeuAlaValL 
euTyrCysLeuLe 
u 


923 


cc 




Tyrosinase 




HLA-A2 
and HLA- 
B44 


SerGluIleTrpArg 
AspIleAspPheAl 
aHisGluAla 


924 


5,519,117 
tc




CC 


SerGluIleTrpArg
AspIleAspPhe 


925 


tc 




tc 




(C 


GluGluAsnLeuL 
euAspPheValAr 
gPhe 


926 


cc 




Melan 
A/MART-1






EAAGIGILTV 


927 


Jager, E. et al. 
Granulocyte- 
macrophage-colony- 
stimulating Factor 
Enhances Immune 
Responses To 
Melanoma-associated
Peptides in vivo, Int. J.
Cancer 67, 54-62
(1996)




Tyrosinase 






MLLAVLYCL 


928 


cc 




« 






YMDGTMSQV 


929 


cc 




gp100/Pmel17






YLEPGPVTA 


930 


(( 




« 






LLDGTATLRL 


931 


cc 




Influenza 
matrix 






GILGFVFTL 


932 


cc 




MAGE-1 






EADPTGHSY 


933 


cc 


















MAGE-1 




HLA-A1 


EADPTGHSY 


934 






BAGE 




HLA-C 


MAARAVFLAL 
SAQLLQARLM 
KE 


935 


cc 




cc 




cc 


MAARAVFLAL 
SAQLLQ 


936 


cc 




cc 




cc 


AARAVFLAL 


937 


cc 


Influenza 


PR8NP 


147-154 


K d 


TYQRTRALV


938 


Falk et al., Allele- 
specific motifs revealed 
by sequencing of self- 
peptides eluted from 
MHC molecules 


SELF 
PEPTIDE 


P815 




« 


SYFPEITHI 


939 


cc 


Influenza 


Jap HA 
523-549 




<l 


IYATVAGSL 


940 


cc 


« 


« 




M 


VYQELAIYA 


941 


cc 


cc 


cc 




M 


IYSTVASSL 


942 


cc


cc


JAP HA 
202-221 




66 


LYQNVGTYV 


943 


cc 




HLA-A24 




66 


RYLENQKRT 


944 


cc 




HLA-Cw3 




66 


RYLKNGKET 


945 


cc 




P815 




66 


KYQAVTTTL 


946 


cc 


Plasmodium 
berghei 


CSP 




cc 


SYIPSAEKI 


947 


cc 


Plasmodium 
yoelii 


CSP 




cc 


SYVPSAFQI 


948 


cc 


Vesicular 
stomatitis 
virus


NP 52-59 




K b 


RGYVYQGL 


949 


cc 


Ovalbumin 






cc 


SIINFEKL


950 


cc 


Sendai virus 


NP321- 
332 




cc 


APGNYPAL 


951 


cc 










VPYGSFKHV 


952 


Morel et al., Processing 
of some antigens by the 
standard proteasome 

but not by the 
immunoproteasome 
results in poor
presentation by 
dendritic cells, 
Immunity, vol. 12:107- 
117, 2000. 















































































MOTIFS 








influenza 


PR8NP 




K d 
restricted 
peptide 
motif 


TYQRTRALV 


953 


5,747,269 


self peptide 


P815 




cc 


SYFPEITHI 


954 


cc 


influenza 


JAP HA 




cc 


IYATVAGSL 


955 


cc 


influenza 


JAP HA 




cc 


VYQELAIYA 


956 


cc 


influenza 


PR8HA 




cc 


IYSTVASSL 


957 


cc 


influenza 


JAP HA 




cc 


LYQNVGTYV 


958 


cc 








HLA-A24 


RYLENGKETL 


959 


cc


HLA-Cw3


RYLKNGKETL 


960 


cc 




P815 

tumour 

antigen 




cc 


KYQAVTTTL 


961 


cc 


Plasmodium 
berghei 


CSP 




(( 


SYPSAEKI 


962 


cc 


Plasmodium 
yoelii 


CSP 




cc 


SYVPSAEQI 


963 


cc 


influenza 


NP 




D b - 
restricted 
peptide 
motif 


ASNENMETM 


964 


cc 


adenovirus 


E1A 




cc 


SGPSNTPPEI 


965 


cc 


lymphocytic 

choriomeningitis






cc 


SGVENPGGYC 
L 


966 


cc 


simian virus 


40 T 




cc 


SAINNY. . . 


967 


cc 


HIV 


reverse 

transcriptase




HLA- 
A2.1- 
restricted 
peptide 
motif 


ILKEPVHGV 


968 


cc 




influenza 

matrix 

protein 




cc 


GILGFVFTL 


969 


cc 


influenza 


influenza 

matrix 

protein 




cc 


ILGFVFTLTV 


970 


cc 


HIV 


Gag protein 






FLQSRPEPT 


971 


cc 


HIV 


Gag protein 






AMQMLKE . . 


972 


cc 


HIV 


Gag protein 






PIAPGQMRE 


973 


cc 


HIV 


Gag protein 






QMKDCTERQ 


974 


cc 








HLA- 
A*0205- 
restricted 
peptide 

motif 


VYGVIQK 


975 


cc 






Table 8 



VSV-NP peptide (49-62) 

LCMV-NP peptide (118-132)
LCMV glycoprotein peptide (33-41)



[0125] Still further embodiments are directed to methods, uses, therapies and 
compositions related to epitopes with specificity for MHC, including, for example, those 
listed in Tables 9-13. Other embodiments include one or more of the MHCs listed in Tables 
9-13, including combinations of the same, while other embodiments specifically exclude any 
one or more of the MHCs or combinations thereof. Tables 11-13 include frequencies for the 
listed HLA antigens. 



Table 9 
Class I MHC Molecules 

Class I 

Human 

HLA-A1 

HLA-A*0101 

HLA-A*0201 

HLA-A*0202 

HLA-A*0203 

HLA-A*0204 

HLA-A*0205 

HLA-A*0206 

HLA-A*0207 

HLA-A*0209 

HLA-A*0214 

HLA-A3 

HLA-A*0301 

HLA-A*1101 

HLA-A23 

HLA-A24 

HLA-A25 

HLA-A*2902 

HLA-A*3101 

HLA-A*3302 

HLA-A*6801 

HLA-A*6901

HLA-B7

HLA-B*0702 

HLA-B*0703 

HLA-B*0704 

HLA-B*0705 

HLA-B8 

HLA-B13 

HLA-B14 

HLA-B*1501 (B62) 

HLA-B17 

HLA-B18 

HLA-B22 

HLA-B27 

HLA-B*2702 

HLA-B*2704 

HLA-B*2705 

HLA-B*2709 

HLA-B35 

HLA-B*3501 

HLA-B*3502 

HLA-B*3701 

HLA-B*3801 

HLA-B*39011 

HLA-B*3902 

HLA-B40 

HLA-B*40012 (B60)

HLA-B*4006 (B61) 

HLA-B44 

HLA-B*4402 

HLA-B*4403 

HLA-B*4501 

HLA-B*4601 

HLA-B51 

HLA-B*5101 

HLA-B*5102 

HLA-B*5103 

HLA-B*5201 

HLA-B*5301 

HLA-B*5401 

HLA-B*5501 

HLA-B*5502 

HLA-B*5601 

HLA-B*5801 

HLA-B*6701 



HLA-B*7301 

HLA-B*7801 

HLA-Cw*0102 

HLA-Cw*0301 

HLA-Cw*0304 

HLA-Cw*0401 

HLA-Cw*0601 

HLA-Cw*0602 

HLA-Cw*0702 

HLA-Cw8 

HLA-Cw*1601

HLA-G 

Murine 

H2-K d 

H2-D d 

H2-L d 

H2-K b 

H2-D b 

H2-K k 

H2-K km1

Qa-1 a

Qa-2 

H2-M3 

Rat 

RT1.A a
RT1.A l

Bovine 

Bota-A11
Bota-A20 

Chicken 

B-F4 
B-F12 
B-F15 
B-F19 

Chimpanzee 

Patr-A*04 
Patr-A*11
Patr-B*01 
Patr-B*13 
Patr-B*16 



Baboon 



Papa-A*06 

Macaque 

Mamu-A*01 

Swine 

SLA (haplotype d/d) 

Virus homolog 

hCMV class I homolog UL18 



Table 10 
Class I MHC Molecules 



Class I 

Human 

HLA-A1 

HLA-A*0101 

HLA-A*0201 

HLA-A*0202 

HLA-A*0204 

HLA-A*0205 

HLA-A*0206 

HLA-A*0207 

HLA-A*0214 

HLA-A3 

HLA-A*1101 

HLA-A24 

HLA-A*2902 

HLA-A*3101 

HLA-A*3302 

HLA-A*6801 

HLA-A*6901 

HLA-B7 

HLA-B*0702 

HLA-B*0703 

HLA-B*0704 

HLA-B*0705 

HLA-B8 

HLA-B14 

HLA-B*1501 (B62) 

HLA-B27 

HLA-B*2702 

HLA-B*2705 

HLA-B35 

HLA-B*3501 

HLA-B*3502 

HLA-B*3701 

HLA-B*3801 

HLA-B*39011

HLA-B*3902 

HLA-B40 

HLA-B*40012 (B60)

HLA-B*4006 (B61)

HLA-B44 

HLA-B*4402 

HLA-B*4403 

HLA-B*4601 

HLA-B51 

HLA-B*5101 

HLA-B*5102 

HLA-B*5103 

HLA-B*5201 

HLA-B*5301 

HLA-B*5401 

HLA-B*5501 

HLA-B*5502 

HLA-B*5601 

HLA-B*5801 

HLA-B*6701 

HLA-B*7301 

HLA-B*7801 

HLA-Cw*0102 

HLA-Cw*0301 

HLA-Cw*0304 

HLA-Cw*0401 

HLA-Cw*0601 

HLA-Cw*0602 

HLA-Cw*0702 

HLA-G 

Murine 

H2-K d 

H2-D d 

H2-L d 

H2-K b 

H2-D b 

H2-K k 

H2-K km1

Qa-2 

Rat 

RT1.A a
RT1.A l

Bovine 

Bota-A11
Bota-A20 



Chicken 

B-F4 
B-F12 
B-F15 
B-F19 

Virus homolog 

hCMV class I homolog UL18 

Table 11

Estimated gene frequencies of HLA-A antigens

                       CAU               AFR               ASI               LAT               NAT
Antigen             Gf a     SE b       Gf       SE       Gf       SE       Gf       SE       Gf       SE
A1               15.1843   0.0489   5.7256   0.0771   4.4818   0.0846   7.4007   0.0978  12.0316   0.2533
A2               28.6535   0.0619  18.8849   0.1317  24.6352   0.1794  28.1198   0.1700  29.3408   0.3585
A3               13.3890   0.0463   8.4406   0.0925   2.6454   0.0655   8.0789   0.1019  11.0293   0.2437
A28               4.4652   0.0280   9.9269   0.0997   1.7657   0.0537   8.9446   0.1067   5.3856   0.1750
[label missing]   0.0221   0.0020   1.8836   0.0448   0.0148   0.0049   0.1584   0.0148   0.1545   0.0303
A23               1.8287   0.0181  10.2086   0.1010   0.3256   0.0231   2.9269   0.0628   1.9903   0.1080
A24               9.3251   0.0395   2.9668   0.0560  22.0391   0.1722  13.2610   0.1271  12.6613   0.2590
A9 unsplit        0.0809   0.0038   0.0367   0.0063   0.0858   0.0119   0.0537   0.0086   0.0356   0.0145
A9 total         11.2347   0.0429  13.2121   0.1128  22.4505   0.1733  16.2416   0.1382  14.6872   0.2756
A25               2.1157   0.0195   0.4329   0.0216   0.0990   0.0128   1.1937   0.0404   1.4520   0.0924
A26               3.8795   0.0262   2.8284   0.0547   4.6628   0.0862   3.2612   0.0662   2.4292   0.1191
A34               0.1508   0.0052   3.5228   0.0610   1.3529   0.0470   0.4928   0.0260   0.3150   0.0432
A43               0.0018   0.0006   0.0334   0.0060   0.0231   0.0062   0.0055   0.0028   0.0059   0.0059
A66               0.0173   0.0018   0.2233   0.0155   0.0478   0.0089   0.0399   0.0074   0.0534   0.0178
A10 unsplit       0.0790   0.0038   0.0939   0.0101   0.1255   0.0144   0.0647   0.0094   0.0298   0.0133
A10 total         6.2441   0.0328   7.1348   0.0850   6.3111   0.0993   5.0578   0.0816   4.2853   0.1565
A29               3.5796   0.0252   3.2071   0.0582   1.1233   0.0429   4.5156   0.0774   3.4345   0.1410
A30               2.5067   0.0212  13.0969   0.1129   2.2025   0.0598   4.4873   0.0772   2.5314   0.1215
A31               2.7386   0.0221   1.6556   0.0420   3.6005   0.0761   4.8328   0.0800   6.0881   0.1855
A32               3.6956   0.0256   1.5384   0.0405   1.0331   0.0411   2.7064   0.0604   2.5521   0.1220
A33               1.2080   0.0148   6.5607   0.0822   9.2701   0.1191   2.6593   0.0599   1.0754   0.0796
A74               0.0277   0.0022   1.9949   0.0461   0.0561   0.0096   0.2027   0.0167   0.1068   0.0252
A19 unsplit       0.0567   0.0032   0.2057   0.0149   0.0990   0.0128   0.1211   0.0129   0.0475   0.0168
A19 total        13.8129   0.0468  28.2593   0.1504  17.3846   0.1555  19.5252   0.1481  15.8358   0.2832
AX                0.8204   0.0297   4.9506   0.0963   2.9916   0.1177   1.6332   0.0878   1.8454   0.1925

a Gene frequency.
b Standard error.
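
As an illustration of how the tabulated gene frequencies might be read, the sketch below converts a gene frequency into the expected fraction of individuals carrying at least one copy of the antigen. It assumes Hardy-Weinberg proportions, which is an assumption introduced here and not something the tables state; the function name is hypothetical.

```python
def carrier_frequency(gene_frequency_percent):
    """Expected fraction of individuals carrying at least one copy of an allele.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch):
    carriers = 1 - (1 - p)^2, where p is the gene frequency as a fraction.
    """
    p = gene_frequency_percent / 100.0
    return 1.0 - (1.0 - p) ** 2


# HLA-A2 in the CAU column of Table 11 has a gene frequency of 28.6535 percent.
print(round(carrier_frequency(28.6535), 4))
```

Under that assumption, an antigen with a gene frequency near 29 percent would be carried by roughly half of the population.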

Table 12



Estimated gene frequencies for HLA-B antigens 





CAU 


AFR 


ASI 


LAT 


NAT 


Antigen 


Gf 


SE b 


Gf 


SE 


Gf 


SE 


Gf 


SE 


Gf 


SE 


B7 


12.1782 


0.0445 


10.5960 


0.1024 


4.2691 


0.0827 


6.4477 


0.0918 


10.9845 


0.2432 


B8 


9.4077 


0.0397 


3.8315 


0.0634 


1.3322 


0.0467 


3.8225 


0.0715 


8.5789 


0.2176 


B13 


2.3061 


0.0203 


0.8103 


0.0295 


4.9222 


0.0886 


1.2699 


0.0416 


1.7495 


0.1013 


B14 


4.3481 


0.0277 


3.0331 


0.0566 


0.5004 


0.0287 


5.4166 


0.0846 


2.9823 


0.1316 


B18 


4.7980 


0.0290 


3.2057 


0.0582 


1.1246 


0.0429 


4.2349 


0.0752 


3.3422 


0.1391 


B27 


4.3831 


0.0278 


1.2918 


0.0372 


2.2355 


0.0603 


2.3724 


0.0567 


5.1970 


0.1721 


B35 


9.6614 


0.0402 


8.5172 


0.0927 


8.1203 


0.1122 


14.6516 


0.1329 


10.1198 


0.2345 


B37 


1.4032 


0.0159 


0.5916 


0.0252 


1.2327 


0.0449 


0.7807 


0.0327 


0.9755 


0.0759 


B41 


0.9211 


0.0129 


0.8183 


0.0296 


0.1303 


0.0147 


1.2818 


0.0418 


0.4766 


0.0531 


B42 


0.0608 


0.0033 


5.6991 


0.0768 


0.0841 


0.0118 


0.5866 


0.0284 


0.2856 


0.0411 


B46 



0.0099 


0.0013 


0.0151 


0.0040 


4.9292 


0.0886 


0.0234


0.0057 


0.0238 


0.0119 



B47 


0.2069 


0.0061 


0.1305 


0.0119 


0.0956 


0.0126 


0.1832 


0.0159 


0.2139 


0.0356 



B48 


0.0865 


0.0040 


0.1316 


0.0119 


2.0276 


0.0575 


1.5915 


0.0466 


1.0267 



0.0778 




0.4620 


0.0092 


10.9529 


0.1039 


0.4315 


0.0266 


1.6982 


0.0481



1.0804 



0.0798 




0.0020 


0.0006 


0.0032 


0.0019 



0.4277 


0.0265 



0.0055 


0.0028 


AC 
0 




Ho/ 


0.0040 


0.0009 


0.0086 


0.0030 


0.2276 


0.0194 


0.0055 


0.0028 


0.0059 


0.0059 


B70 


0.3270 


0.0077 


7.3571 


0.0866 


0.8901 


0.0382 


1.9266 


0.0512 


0.6901 


0.0639 


B73 


0.0108 


0.0014 


0.0032 


0.0019 


0.0132 


0.0047 


0.0261 


0.0060 


0 C 




B51 


5.4215 


0.0307 


2.5980 


0.0525 


7.4751 


0.1080 


6.8147 


0.0943 


6.9077 


0.1968 


B52 


0.9658 


0.0132 


1.3712 


0.0383 


3.5121 


0.0752 


2.2447 


0.0552 


0.6960 


0.0641 


B5 unsplit 


0.1565 


0.0053 


0.1522 


0.0128 


0.1288 


0.0146 


0.1546 


0.0146 


0.1307 


0.0278 


B5 total 


6.5438 


0.0435 


4.1214 


0.0747 


11.1160 


0.1504 


9.2141 


0.1324 


7.7344 


0.2784 


B44 


13.4838 


0.0465 


7.0137 


0.0847 


5.6807 


0.0948 


9.9253 


0.1121 


11.8024 


0.2511 


B45 


0.5771 


0.0102 


4.8069 


0.0708 


0.1816 


0.0173 


1.8812 


0.0506 


0.7603 


0.0670 


B12 unsplit 


0.0788 


0.0038 


0.0280 


0.0055 


0.0049 


0.0029 


0.0193 


0.0051 


0.0654 


0.0197 



B12 total 


14.1440 


0.0474 


11.8486 


0.1072 


5.8673 


0.0963 


11.8258 


0.1210 


12.6281 


0.2584 


B62 


5.9117 


0.0320 


1.5267 


0.0404 


9.2249 


0.1190 


4.1825 


0.0747 


6.9421 
0.3738 


0.1973 


B63 


0.4302 


0.0088 


1.8865 


0.0448 


0.4438 


0.0270 


0.8083 


0.0333 


0.0356 


0.0471 


B75 


0.0104 


0.0014 


0.0226 


0.0049 


1.9673 


0.0566 


0.1101 


0.0123 


0 


0.0145 


B76 


0.0026 


0.0007 


0.0065 


0.0026 


0.0874 


0.0120 


0.0055 


0.0028 


AC 

0 




B77 


0.0057 


0.0010 


0.0119 


0.0036 


0.0577 


0.0098 


0.0083 


0.0034 


0.0059 


0.0059 


B15 unsplit 


0.1305 


0.0049 


0.0691 


0.0086 


0.4301 


0.0266 


0.1820 


0.0158 



0.0715 


0.0206 


B15 total 


6.4910 


0.0334 


3.5232 


0.0608 


12.2112 


0.1344 


5.2967 


0.0835 



7.4290 


0.2035 


B38 


2.4413 



0.0209 


0.3323 


0.0189 


3.2818 


0.0728 


1.9652 


0.0517 


1.1017 


0.0806 


B39 


1.9614 


0.0188 


1.2893


0.0371 


2.0352 


0.0576 


6.3040 


0.0909 


4.5527 


0.1615 


B 1 6 unsplit 


U.Uoio 




yj.uZi 1 


U.UUj 1 


A (\(LAA 


U.UiUj 


u. izzo 


U.Ul j\) 


A ACQ1 


U.Uloo 


£510 lOUU 


4.4667 


0.0280 


1.6453 


0.0419 


5.3814 


0.0921 


8.3917 


0.1036 


5.7137 


0.1797 


B57 


3.5955 


0.0252 


5.6746 


0.0766 


2.5782 


0.0647 


2.1800 


0.0544 


2.7265 


0.1260 


B58 


0.7152 


0.0114 


5.9546 


0.0784 


4.0189 


0.0803 


1.2481 


0.0413 


0.9398 


0.0745 


B17 unsplit 


0.2845 


0.0072 


0.3248 


0.0187 


0.3751 


0.0248 


0.1446 


0.0141 


0.2674 


0.0398 


B17 total 


4.5952 


0.0284 


11.9540 


0.1076 


6.9722 


0.1041 


3.5727 


0.0691 


3.9338 


0.1503 


B49 


1.6452 


0.0172 


2.6286 


0.0528 


0.2440 


0.0200 


2.3353 


0.0562 


1.5462 


0.0953 


B50 


1.0580 


0.0138 


0.8636 


0.0304 


0.4421 


0.0270 


1.8883 


0.0507 


0.7862 


0.0681 


B21 unsplit 


0.0702 


0.0036 


0.0270 


0.0054 


0.0132 


0.0047 


0.0771 


0.0103 


0.0356 


0.0145 


B21 total 


2.7733 


0.0222 


3.5192 


0.0608 


0.6993 


0.0339 


4.3007 


0.0755 


2.3680 


0.1174 


B54 


0.0124 


0.0015 


0.0183 


0.0044 


2.6873 


0.0660 


0.0289 


0.0063 


0.0534 


0.0178


B55 


1.9046 


0.0185 


0.4895 


0.0229 


2.2444 


0.0604 


0.9515 


0.0361 


1.4054 


0.0909 


B56 


0.5527 


0.0100 


0.2686 


0.0170 


0.8260 


0.0368 


0.3596 


0.0222 


0.3387 


0.0448 


B22 unsplit 


0.1682 


0.0055 


0.0496 


0.0073 


0.2730 


0.0212 


0.0372 


0.0071 


0.1246 


0.0272 


B22 total 


2.0852 


0.0217 


0.8261 


0.0297 


6.0307 


0.0971 


1.3771 


0.0433 


1.9221 


0.1060 


B60 


5.2222 


0.0302 


1.5299 


0.0404 


8.3254 


0.1135 


2.2538 


0.0553 


5.7218 


0.1801 


B61 


1.1916 


0.0147 


0.4709 


0.0225 


6.2072 


0.0989 


4.6691 


0.0788 


2.6023 


0.1231 


B40 unsplit 


0.2696 


0.0070 


0.0388 


0.0065 


0.3205 


0.0230 


0.2473 


0.0184 


0.2271 


0.0367 


B40 total 


6.6834 


0.0338 


2.0396 


0.0465 


14.8531 


0.1462 


7.1702 


0.0963 


8.5512 


0.2168 


BX 


1.0922 


0.0252 


3.5258 


0.0802 


3.8749 


0.0988 


2.5266 


0.0807 


1.9867 


0.1634 



a Gene frequency.
b Standard error.
c The observed gene count was zero.



Table 13

Estimated gene frequencies of HLA-DR antigens

                   CAU               AFR               ASI               LAT               NAT
Antigen         Gf a     SE b       Gf       SE       Gf       SE       Gf       SE       Gf       SE
DR1          10.2279   0.0413   6.8200   0.0832   3.4628   0.0747   7.9859   0.1013   8.2512   0.2139
DR2          15.2408   0.0491  16.2373   0.1222  18.6162   0.1608  11.2389   0.1182  15.3932   0.2818
DR3          10.8708   0.0424  13.3080   0.1124   4.7223   0.0867   7.8998   0.1008  10.2549   0.2361
DR4          16.7589   0.0511   5.7084   0.0765  15.4623   0.1490  20.5373   0.1520  19.8264   0.3123
DR6          14.3937   0.0479  18.6117   0.1291  13.4471   0.1404  17.0265   0.1411  14.8021   0.2772
DR7          13.2807   0.0463  10.1317   0.0997   6.9270   0.1040  10.6726   0.1155  10.4219   0.2378
DR8           2.8820   0.0227   6.2673   0.0800   6.5413   0.1013   9.7731   0.1110   6.0059   0.1844
DR9           1.0616   0.0139   2.9646   0.0559   9.7527   0.1218   1.0712   0.0383   2.8662   0.1291
DR10          1.4790   0.0163   2.0397   0.0465   2.2304   0.0602   1.8044   0.0495   1.0896   0.0801
DR11          9.3180   0.0396  10.6151   0.1018   4.7375   0.0869   7.0411   0.0955   5.3152   0.1740
DR12          1.9070   0.0185   4.1152   0.0655  10.1365   0.1239   1.7244   0.0484   2.0132   0.1086
DR5 unsplit   1.2199   0.0149   2.2957   0.0493   1.4118   0.0480   1.8225   0.0498   1.6769   0.0992
DR5 total    12.4449   0.0045  17.0260   0.1243  16.2858   0.1516  10.5880   0.1148   9.0052   0.2218
DRX           1.3598   0.0342   0.8853   0.0760   2.5521   0.1089   1.4023   0.0930   2.0834   0.2037

a Gene frequency.
b Standard error.



[0131] It can be desirable to express housekeeping peptides in the context of a 
larger protein. Processing can be detected even when a small number of amino acids are 
present beyond the terminus of an epitope. Small peptide hormones are usually
proteolytically processed from longer translation products, often in the size range of
approximately 60-120 amino acids. This fact has led some to assume that this is the 
minimum size that can be efficiently translated. In some embodiments, the housekeeping 
peptide can be embedded in a translation product of at least about 60 amino acids, in others 
70, 80, 90 amino acids, and in still others 100, 110 or 120 amino acids, for example. In other
embodiments the housekeeping peptide can be embedded in a translation product of at least 
about 50, 30, or 15 amino acids. 

[0132] Due to differential proteasomal processing, the immunoproteasome of the 
pAPC produces peptides that are different from those produced by the housekeeping 
proteasome in peripheral body cells. Thus, in expressing a housekeeping peptide in the 
context of a larger protein, it is preferably expressed in the pAPC in a context other than its 
full-length native sequence, because, as a housekeeping epitope, it is generally only 
efficiently processed from the native protein by the housekeeping proteasome, which is not 
active in the pAPC. In order to encode the housekeeping epitope in a DNA sequence 
encoding a larger polypeptide, it is useful to find flanking areas on either side of the sequence 
encoding the epitope that permit appropriate cleavage by the immunoproteasome in order to 
liberate that housekeeping epitope. Such a sequence promoting appropriate processing is 
referred to hereinafter as having substrate or liberation sequence function. Altering flanking 
amino acid residues at the N-terminus and C-terminus of the desired housekeeping epitope 
can facilitate appropriate cleavage and generation of the housekeeping epitope in the pAPC. 
Sequences embedding housekeeping epitopes can be designed de novo and screened to 
determine which can be successfully processed by immunoproteasomes to liberate 
housekeeping epitopes. 

[0133] Alternatively, another strategy is very effective for identifying sequences 
allowing production of housekeeping epitopes in APC. A contiguous sequence of amino 
acids can be generated from head to tail arrangement of one or more housekeeping epitopes. 
A construct expressing this sequence is used to immunize an animal, and the resulting T cell 
response is evaluated to determine its specificity to one or more of the epitopes in the array. 
These immune responses indicate housekeeping epitopes that are processed in the pAPC
effectively. The necessary flanking areas around this epitope are thereby defined. The use of
flanking regions of about 4-6 amino acids on either side of the desired peptide can provide 
the necessary information to facilitate proteasome processing of the housekeeping epitope by 
the immunoproteasome. Therefore, a substrate or liberation sequence of approximately 16-22 
amino acids can be inserted into, or fused to, any protein sequence effectively to result in that 
housekeeping epitope being produced in an APC. In some embodiments, a broader context 
of a substrate sequence can also influence processing. In such embodiments, comparisons of 
a liberation sequence in a variety of contexts can be useful in further optimizing a particular
substrate sequence. In alternate embodiments the whole head-to-tail array of epitopes, or just 
the epitopes immediately adjacent to the correctly processed housekeeping epitope can be 
similarly transferred from a test construct to a vaccine vector. 
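
The length arithmetic behind the figure of approximately 16-22 amino acids can be sketched as follows. This is a minimal illustration only: the epitope length range of 8 to 10 residues is an assumption made here (the paragraph gives only the flank range and the total), and both function names are hypothetical. The test array is built from the three peptides of SEQ ID NOS. 6, 1 and 7.

```python
def substrate_length_range(epitope_lengths=(8, 10), flank_lengths=(4, 6)):
    """Return (shortest, longest) length of an epitope plus both flanks.

    epitope_lengths is an assumption of this sketch (8-10 residues);
    flank_lengths is the 4-6 residue range given in the text.
    """
    shortest = epitope_lengths[0] + 2 * flank_lengths[0]
    longest = epitope_lengths[1] + 2 * flank_lengths[1]
    return shortest, longest


def excise_substrate(construct, epitope, flank=5):
    """Cut the epitope plus up to `flank` residues on each side out of a construct."""
    start = construct.find(epitope)
    if start < 0:
        raise ValueError("epitope not found in construct")
    end = start + len(epitope)
    return construct[max(0, start - flank):end + flank]


print(substrate_length_range())
array = "MLLAVLYCL" + "ELAGIGILTV" + "YMDGTMSQV"
print(excise_substrate(array, "ELAGIGILTV", flank=5))
```

With five flanking residues on each side of a ten-residue epitope, the excised candidate is 20 residues long, inside the stated range.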

[0134] In a preferred embodiment, the housekeeping epitopes can be embedded 
between known immune epitopes, or segments of such, thereby providing an appropriate 
context for processing. The abutment of housekeeping and immune epitopes can generate the 
necessary context to enable the immunoproteasome to liberate the housekeeping epitope, or a 
larger fragment, preferably including a correct C-terminus. It can be useful to screen 
constructs to verify that the desired epitope is produced. The abutment of housekeeping 
epitopes can generate a site cleavable by the immunoproteasome. Some embodiments of the 
invention employ known epitopes to flank housekeeping epitopes in test substrates; in others, 
screening as described below is used, whether the flanking regions are arbitrary sequences or 
mutants of the natural flanking sequence, and whether or not knowledge of proteasomal
cleavage preferences is used in designing the substrates.

[0135] Cleavage at the mature N-terminus of the epitope, while advantageous, is 
not required, since a variety of N-terminal trimming activities exist in the cell that can 
generate the mature N-terminus of the epitope subsequent to proteasomal processing. It is 
preferred that such N-terminal extension be less than about 25 amino acids in length and it is 
further preferred that the extension have few or no proline residues. Preferably, in screening, 
consideration is given not only to cleavage at the ends of the epitope (or at least at its C- 
terminus), but consideration also can be given to ensure limited cleavage within the epitope. 
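
The screening criteria in this paragraph can be expressed as a small check, sketched below under stated assumptions. The function names are hypothetical, the cutoff of one proline for "few or no proline residues" is an assumed value, and cleavage sites are represented as 1-based positions of the residue after which the bond is cut.

```python
def assess_fragment(fragment, epitope, max_extension=25, max_prolines=1):
    """Judge a digestion fragment against the criteria described above.

    max_prolines is an assumed cutoff for "few or no" proline residues.
    """
    if not fragment.endswith(epitope):
        return {"correct_c_terminus": False, "extension": None, "acceptable": False}
    extension = fragment[: len(fragment) - len(epitope)]
    acceptable = len(extension) < max_extension and extension.count("P") <= max_prolines
    return {"correct_c_terminus": True, "extension": extension, "acceptable": acceptable}


def internal_cleavages(substrate, epitope, cleavage_sites):
    """Return the cleavage sites that fall inside the epitope.

    A site p means the bond after residue p (1-based) is cut, so a cut after
    the last epitope residue is the desired C-terminal cleavage, not internal.
    """
    start = substrate.find(epitope)
    if start < 0:
        raise ValueError("epitope not found in substrate")
    first = start + 1
    last = start + len(epitope)
    return sorted(p for p in cleavage_sites if first <= p < last)


substrate = "MLLAVLYCLELAGIGILTVYMDGTMSQV"
print(assess_fragment("YCLELAGIGILTV", "ELAGIGILTV"))
print(internal_cleavages(substrate, "ELAGIGILTV", {9, 14, 19}))
```

A fragment with the correct C-terminus and a short, proline-poor N-terminal extension passes; any reported internal site flags destruction of the epitope.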

[0136] Shotgun approaches can be used in designing test substrates and can
increase the efficiency of screening. In one embodiment multiple epitopes can be assembled 
one after the other, with individual epitopes possibly appearing more than once. The 
substrate can be screened to determine which epitopes can be produced. In the case where a 
particular epitope is of concern, a substrate can be designed in which it appears in multiple 
different contexts. When a single epitope appearing in more than one context is liberated
from the substrate, additional secondary test substrates, in which individual instances of the
epitope are removed, disabled, or are unique, can be used to determine which are being 
liberated and truly confer substrate or liberation sequence function. 
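
A shotgun substrate of this kind, and the distinct contexts in which a repeated epitope appears, can be sketched as follows. The ordering of epitopes in the array is hypothetical and chosen only for illustration; the three peptides are those of SEQ ID NOS. 6, 1 and 7, and the function names are assumptions of this sketch.

```python
def build_array(epitopes):
    """Assemble epitopes head to tail into a single test substrate."""
    return "".join(epitopes)


def occurrences_with_context(array, epitope, flank=4):
    """List each occurrence of an epitope with its flanking residues.

    Returns tuples of (1-based start position, N-terminal flank, C-terminal flank).
    """
    results = []
    start = array.find(epitope)
    while start != -1:
        end = start + len(epitope)
        results.append((start + 1, array[max(0, start - flank):start], array[end:end + flank]))
        start = array.find(epitope, start + 1)
    return results


TYR_1_9 = "MLLAVLYCL"
ELA = "ELAGIGILTV"
TYR_369_377 = "YMDGTMSQV"

# Hypothetical ordering in which ELA appears in two different contexts.
array = build_array([TYR_1_9, ELA, TYR_369_377, ELA, TYR_1_9])
for position, n_flank, c_flank in occurrences_with_context(array, ELA):
    print(position, n_flank, c_flank)
```

Each listed context is a candidate for a secondary test substrate in which the other instance is removed or disabled.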

[0137] Several readily practicable screens exist. A preferred in vitro screen 
utilizes proteasomal digestion analysis, using purified immunoproteasomes, to determine if 
the desired housekeeping epitope can be liberated from a synthetic peptide embodying the 
sequence in question. The position of the cleavages obtained can be determined by 
techniques such as mass spectrometry, HPLC, and N-terminal pool sequencing; as described 
in greater detail in U.S. Patent Application Nos. 09/561,074, 09/560,465 and 10/117,937, and
Provisional U.S. Patent Application Nos. 60/282,211, 60/337,017, and 60/363,210, which
were all cited and incorporated by reference above. 

[0138] Alternatively, in vivo and cell-based screens such as immunization or 
target sensitization can be employed. For immunization a nucleic acid construct capable of 
expressing the sequence in question is used. Harvested CTL can be tested for their ability to 
recognize target cells presenting the housekeeping epitope in question. Such target cells are
most readily obtained by pulsing cells expressing the appropriate MHC molecule with 
synthetic peptide embodying the mature housekeeping epitope. Alternatively, immunization 
can be carried out using cells known to express housekeeping proteasome and the antigen 
from which the housekeeping epitope is derived, either endogenously or through genetic 
engineering. To use target sensitization as a screen, CTL, or preferably a CTL clone, that 
recognizes the housekeeping epitope can be used. In this case it is the target cell that 
expresses the embedded housekeeping epitope (instead of the pAPC during immunization) 
and it must express immunoproteasome. Generally, the cell or target cell can be transformed
with an appropriate nucleic acid construct to confer expression of the embedded
housekeeping epitope. Loading with a synthetic peptide embodying the embedded epitope 
using peptide loaded liposomes, or complexed with cationic lipid protein transfer reagents 
such as BIOPORTER™ (Gene Therapy Systems, San Diego, CA), represents an alternative. 

[0139] Once sequences with substrate or liberation sequence function are 
identified they can be encoded in nucleic acid vectors, chemically synthesized, or produced 
recombinantly. In any of these forms they can be incorporated into immunogenic 
compositions. Such compositions can be used in vitro in vaccine development or in the 
generation or expansion of CTL to be used in adoptive immunotherapy. In vivo they can be 
used to induce, amplify or sustain an active immune response. The uptake of polypeptides
for processing and presentation can be greatly enhanced by packaging with cationic lipid, the 
addition of a tract of cationic amino acids such as poly-L-lysine (Ryser, H.J. et al., J. Cell
Physiol. 113:167-178, 1982; Shen, W.C. & Ryser, H.J., Proc. Natl. Acad. Sci. USA 75:1872-
1876, 1978), the incorporation into branched structures with importation signals (Sheldon, K.
et al., Proc. Natl. Acad. Sci. USA 92:2056-2060, 1995), or mixture with or fusion to
polypeptides with protein transfer function including peptide carriers such as pep-1 (Morris,
M.C., et al., Nat. Biotech. 19:1173-1176, 2001), the PreS2 translocation motif of hepatitis B
virus surface antigen, VP22 of herpes viruses, and HIV-TAT protein (Oess, S. & Hildt, E.,
Gene Ther. 7:750-758, 2000; Ford, K.G., et al., Gene Ther. 8:1-4, 2001; Hung, C.F. et al., J.
Virol. 76:2676-2682, 2002; Oliveira, S.C., et al., Hum. Gene Ther. 12:1353-1359, 2001;
Normand, N. et al., J. Biol. Chem. 276:15042-15050, 2001; Schwartz, J.J. & Zhang, S., Curr.
Opin. Mol. Ther. 2:162-167, 2000; Elliott, G. & O'Hare, P., Cell 88:223-233, 1997), among other
methodologies. Particularly for fusion proteins the immunogen can be produced in culture 
and the purified protein administered or, in the alternative, the nucleic acid vector can be 
administered so that the immunogen is produced and secreted by cells transformed in vivo. In 
either scenario the transport function of the fusion protein facilitates uptake by pAPC. 

[0140] The following examples are intended for illustration purposes only, and 
should not be construed as limiting the scope of the invention in any way.

EXAMPLES
Example 1 

[0141] A recombinant DNA plasmid vaccine, pMA2M, which encodes one 
polypeptide with an HLA A2-specific CTL epitope ELAGIGILTV (SEQ ID NO. 1) from
melan-A (26-35 A27L), and a portion (amino acids 31-96) of melan-A (SEQ ID NO. 2) 
including the epitope clusters at amino acids 31-48 and 56-69, was constructed. These 
clusters were previously disclosed in U.S. Patent Application No. 09/561,571 entitled 
EPITOPE CLUSTERS incorporated by reference above. Flanking the defined melan-A CTL 
epitope are short amino acid sequences derived from human tyrosinase (SEQ ID NO. 3) to 
facilitate liberation of the melan-A housekeeping epitope by processing by the 
immunoproteasome. In addition, these amino acid sequences represent potential CTL 
epitopes themselves. The cDNA sequence for the polypeptide in the plasmid is under the 
control of promoter/enhancer sequence from cytomegalovirus (CMVp) (see Figure 4), which 
allows efficient transcription of messenger for the polypeptide upon uptake by APCs. The 
bovine growth hormone polyadenylation signal (BGH polyA) at the 3' end of the encoding 
sequence provides a signal for polyadenylation of the messenger to increase its stability as 
well as for translocation out of nucleus into the cytoplasm for translation. To facilitate 
plasmid transport into the nucleus after uptake, a nuclear import sequence (NIS) from simian 
virus 40 (SV40) has been inserted in the plasmid backbone. The plasmid carries two copies 
of a CpG immunostimulatory motif, one in the NIS sequence and one in the plasmid 
backbone. Lastly, two prokaryotic genetic elements in the plasmid are responsible for 
amplification in E. coli, the kanamycin resistance gene (Kan R) and the pMB1 bacterial origin
of replication. 

SUBSTRATE or LIBERATION sequence 

[0142] The amino acid sequence of the encoded polypeptide (94 amino acid 
residues in length) (SEQ ID NO. 4) containing a 28 amino acid substrate or liberation
sequence at its N-terminus (SEQ ID NO. 5) is given below:

[0143] MLLAVLYCL-ELAGIGILTV-YMDGTMSQV-
GILTVILGVLLLIGCWYCRRRNGYRALMDKSLHV 
LQEKNCEPV 

[0144] The first 9 amino acid residues are derived from tyrosinase 1-9 (SEQ ID
NO. 6), the next ten constitute melan-A (26-35A27L) (SEQ ID NO. 1), and amino acid
residues 20 to 28 are derived from tyrosinase 369-377 (SEQ ID NO. 7). These two tyrosinase
nonamer sequences both represent potential HLA A2-specific CTL epitopes. Amino acid 
residues 10-19 constitute melan-A (26-35A27L), an analog of an HLA A2-specific CTL
epitope from melan-A, EAAGIGILTV (SEQ ID NO. 8), with an elevated potency in inducing 
CTL responses during in vitro immunization of human PBMC and in vivo immunization in 
mice. The segment of melan-A constituting the rest of the polypeptide (amino acid residues
29 to 94) contains a number of predicted HLA A2-specific epitopes, including the epitope
clusters cited above, and thus can be useful in generating a response to immune epitopes as 
described at length in the patent applications 'Epitope Synchronization in Antigen Presenting 
Cells' and 'Epitope Clusters' cited and incorporated by reference above. This region was 
also included to overcome any difficulties that can be associated with the expression of 
shorter sequences. A drawing of pMA2M is shown in Figure 4. 
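
The residue positions of the three N-terminal segments can be checked by a direct count of the sequence as displayed. The sketch below is illustrative only; the function name and segment labels are chosen here for readability.

```python
SEGMENTS = [
    ("tyrosinase 1-9", "MLLAVLYCL"),
    ("melan-A 26-35 A27L", "ELAGIGILTV"),
    ("tyrosinase 369-377", "YMDGTMSQV"),
]


def layout(segments):
    """Return (name, first residue, last residue) for consecutive segments, 1-based."""
    rows = []
    position = 1
    for name, sequence in segments:
        rows.append((name, position, position + len(sequence) - 1))
        position += len(sequence)
    return rows


for name, first, last in layout(SEGMENTS):
    print(f"{name}: residues {first}-{last}")
```

By this count the three segments together span 28 residues, which agrees with the stated length of the substrate or liberation sequence.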

Plasmid construction 

[0145] A pair of long complementary oligonucleotides was synthesized which
encoded the first 30 amino acid residues. In addition, upon annealing, these oligonucleotides
generated the cohesive ends of Afl II at the 5' end and that of EcoR I at the 3' end. The
melan-A 31-96 region was amplified with PCR using oligonucleotides carrying restriction sites
for EcoR I at the 5' end and Not I at the 3' end. The PCR product was digested with EcoR I
and Not I and ligated into the vector backbone, described in Example 1, that had been
digested with Afl II and Not I, along with the annealed oligonucleotides encoding the amino
terminal region in a three-fragment ligation. The entire coding sequence was verified by
DNA sequencing. The sequence of the entire insert, from the Afl II site at the 5' end to the
Not I site at the 3' end, is disclosed as SEQ ID NO. 9. Nucleotides 12-293 encode the
polypeptide.
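
As a consistency check on these coordinates, the following Python sketch translates the first 28 codons of the insert, beginning at nucleotide 12, with the standard genetic code. The nucleotide string is transcribed from the SEQ ID NO. 9 listing in Table 29; the helper name `translate` and the variable names are illustrative assumptions and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str, start: int, end: int) -> str:
    """Translate nucleotide positions start..end (1-based, inclusive)."""
    coding = dna.upper()[start - 1:end]
    if len(coding) % 3:
        raise ValueError("coding region is not a whole number of codons")
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, len(coding), 3))


# 5' end of the pMA2M insert: Afl II site and leader (positions 1-11),
# followed by the 28 codons of the substrate or liberation sequence.
INSERT_5PRIME = (
    "cttaagccacc"
    "atgttactagctgttttgtactgcctg"
    "gaactagcagggatcggcatattgacagtg"
    "tatatggatggaacaatgtcccaggta"
)

liberation_sequence = translate(INSERT_5PRIME, 12, 12 + 28 * 3 - 1)
codons_in_12_to_293 = (293 - 12 + 1) // 3

print(liberation_sequence)
print(codons_in_12_to_293)
```

Nucleotides 12-293 span 282 bases, that is, 94 codons, which agrees with the stated length of the encoded polypeptide.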

Example 2 

[0146] Three vectors containing melan-A (26-35A27L) (SEQ ID NO. 1) as an
embedded housekeeping epitope were tested for their ability to induce a CTL response to this
epitope in HLA-A2 transgenic HHD mice (Pascolo et al. J. Exp. Med. 185:2043-2051, 1997).
One of the vectors was pMA2M described above (called pVAXM3 in Figure 6). In pVAXM2
the same basic group of 3 epitopes was repeated several times with the flanking epitopes
truncated by differing degrees in the various repeats of the array. Specifically the cassette
consisted of:

[0147] M-Tyr(5-9)-ELA-Tyr(369-373)-Tyr(4-9)-ELA-Tyr(369-374)-Tyr(3-9)-ELA-
Tyr(369-375)-Tyr(2-9)-ELA
(SEQ ID NO. 10)

[0148] where ELA represents melan-A (26-35A27L) (SEQ ID NO. 1). This
cassette was inserted in the same plasmid backbone as used for pVAXM3. The third,
pVAXM1, is identical to pVAXM2 except that the epitope array is followed by an IRES
(internal ribosome entry site from encephalomyocarditis virus) linked to a reading frame
encoding melan-A 31-70.
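
The progressive truncation of the flanking tyrosinase epitopes can be sketched in Python as follows. The truncation boundaries used here are read off the SEQ ID NO. 10 listing in Table 29; the helper names `tyr_n_term` and `tyr_c_term` and the `REPEATS` layout are illustrative assumptions, not part of the disclosure.

```python
TYR_1_9 = "MLLAVLYCL"      # tyrosinase 1-9 (SEQ ID NO. 6)
TYR_369_377 = "YMDGTMSQV"  # tyrosinase 369-377 (SEQ ID NO. 7)
ELA = "ELAGIGILTV"         # melan-A 26-35 A27L (SEQ ID NO. 1)


def tyr_n_term(first: int) -> str:
    """Residues first..9 of tyrosinase (1-based numbering)."""
    return TYR_1_9[first - 1:]


def tyr_c_term(last: int) -> str:
    """Residues 369..last of tyrosinase (1-based numbering)."""
    return TYR_369_377[: last - 369 + 1]


# Each repeat: (first residue of the upstream tyrosinase fragment,
#               last residue of the downstream fragment, or None if absent).
REPEATS = [(5, 373), (4, 374), (3, 375), (2, None)]

cassette = "M"  # initiator methionine
for n_start, c_end in REPEATS:
    cassette += tyr_n_term(n_start) + ELA
    if c_end is not None:
        cassette += tyr_c_term(c_end)

print(cassette)
```

Under these assumptions the assembled string has 85 residues and contains four copies of the ELA decamer, each in a different flanking context.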

[0149] Four groups of three HHD A2.1 mice were injected intranodally in
surgically exposed inguinal lymph nodes with 25 µl of 1 mg/ml plasmid DNA in PBS on
days 0, 3, and 6, each group receiving one of the three vectors or PBS alone. On day 14 the
spleens were harvested and restimulated in vitro one time with 3-day LPS blasts pulsed with
peptide (melan-A (26-35A27L) (SEQ ID NO. 1)). The in vitro cultures were supplemented
with Rat T-Stim (Collaborative Biomedical Products) on the 3rd day and assayed for cytolytic
activity on the 7th day using a standard 51Cr-release assay. Figures 5 to 8 show % specific
lysis obtained using the cells immunized with PBS, pVAXM1, pVAXM2, and pVAXM3,
respectively, on T2 target cells and T2 target cells pulsed with melan-A (26-35A27L) (ELA)
(SEQ ID NO. 1). All three vectors generated strong CTL responses. These data indicated
that the plasmids had been taken up by APCs, that the encoded polypeptide had been synthesized
and proteolytically processed to produce the decamer epitope in question (that is, it had
substrate or liberation sequence function), and that the epitope became HLA-A2 bound for
presentation. Also, an isolated variant of pVAXM2 that terminates after the 55th amino acid
worked about as well as the full length version (data not shown). Whether other potential
epitopes within the expression cassette can also be produced and be active in inducing CTL
responses can be determined by testing for CTL activity against target cells pulsed with
corresponding synthetic peptides.



Example 3 

An NY-ESO-1 (SEQ ID NO. 11) SUBSTRATE/LIBERATION Sequence

[0150] Six other epitope arrays were tested leading to the identification of a 
substrate/liberation sequence for the housekeeping epitope NY-ESO-1 157-165 (SEQ ID NO. 
12). The component epitopes of the arrays were: 



[0151] SSX-2 41-49: KASEKIFYV (SEQ ID NO. 13) Array element A

[0152] NY-ESO-1 157-165: SLLMWITQC (SEQ ID NO. 12) Array element B

[0153] NY-ESO-1 163-171: TQCFLPVFL (SEQ ID NO. 14) Array element C

[0154] PSMA 288-297: GLPSIPVHPI (SEQ ID NO. 15) Array element D

[0155] Tyr 4-9: AVLYCL (SEQ ID NO. 16) Array element E



[0156] The six arrays had the following arrangements of elements after starting
with an initiator methionine:

[0157] pVAX-PC-A: B-A-D-D-A-B-A-A

[0158] pVAX-PC-B: D-A-B-A-A-D-B-A

[0159] pVAX-PC-C: [element order illegible in source]

[0160] pVAX-BC-A: B-A-C-B-A-A-C-A

[0161] pVAX-BC-B: C-A-B-C-A-A-B-A

[0162] pVAX-BC-C: E-A-A-B-C-B-A-A



[0163] These arrays were inserted into the same vector backbone described in the 
examples above. The plasmid vectors were used to immunize mice essentially as described 
in Example 2 and the resulting CTL were tested for their ability to specifically lyse target 
cells pulsed with the peptide NY-ESO-1 157-165, corresponding to element B above. Both 
pVAX-PC-A and pVAX-BC-A were found to induce specific lytic activity. Comparing the 
contexts of the epitope (element B) in the various arrays, and particularly between pVAX- 
PC-A and pVAX-BC-A, between pVAX-PC-A and pVAX-PC-B, and between pVAX-BC-A 
and pVAX-BC-C, it was concluded that it was the first occurrence of the epitope in pVAX- 
PC-A and pVAX-BC-A that was being correctly processed and presented. In other words, an
initiator methionine followed by elements B-A constitutes a substrate/liberation sequence for
the presentation of element B. On this basis a new expression cassette for use as a vaccine
was constructed encoding the following elements: 

[0164] An initiator methionine,

[0165] NY-ESO-1 157-165 (bold) - a housekeeping epitope,
[0166] SSX-2 41-49 (italic) - providing appropriate context for processing, and
[0167] NY-ESO-1 77-180 - to avoid "short sequence" problems and provide
immune epitopes.

[0168] Thus the construct encodes the amino acid sequence:

[0169] M-SLLMWITQC-KASEKIFYV-
RCGARGPESRLLEFYLAMPFATPMEAELARRSLAQDAPPLPVPGVLLKEFTVSGNILT
IRLTAADHRQLQLSISSCLQQLSLLMWITQCFLPVFLAQPPSGQRR (SEQ ID NO. 17),
and MSLLMWITQCKASEKIFYV (SEQ ID NO. 18) constitutes the liberation or substrate
sequence. A polynucleotide encoding SEQ ID NO. 17 (SEQ ID NO. 19: nucleotides 12-380)
was inserted into the same plasmid backbone as used for pMA2M, generating the plasmid
pN157.
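
The comparison of contexts that led to this conclusion can be sketched in Python as follows. The sketch lists the immediate neighbors of each occurrence of element B in the arrays of Example 3 and keeps the context that is shared by the two arrays that induced lysis and absent from the others. The array whose element order is not legible in this text is omitted, and the names `ARRAYS`, `RESPONDERS`, and `contexts` are illustrative assumptions. Reducing "context" to the two adjacent elements is a simplification made for the example only.

```python
# Element order of each array, after the initiator methionine.
ARRAYS = {
    "pVAX-PC-A": "B-A-D-D-A-B-A-A",
    "pVAX-PC-B": "D-A-B-A-A-D-B-A",
    "pVAX-BC-A": "B-A-C-B-A-A-C-A",
    "pVAX-BC-B": "C-A-B-C-A-A-B-A",
    "pVAX-BC-C": "E-A-A-B-C-B-A-A",
}
# Arrays reported to induce specific lytic activity against element B.
RESPONDERS = {"pVAX-PC-A", "pVAX-BC-A"}


def contexts(order: str, element: str = "B") -> set:
    """Return the (left neighbor, right neighbor) pairs around each occurrence."""
    elems = ["M"] + order.split("-") + ["end"]
    return {
        (elems[i - 1], elems[i + 1])
        for i in range(1, len(elems) - 1)
        if elems[i] == element
    }


shared_by_responders = set.intersection(*(contexts(ARRAYS[name]) for name in RESPONDERS))
seen_in_non_responders = set().union(
    *(contexts(order) for name, order in ARRAYS.items() if name not in RESPONDERS)
)
candidate_contexts = shared_by_responders - seen_in_non_responders

print(candidate_contexts)
```

Under these assumptions the only context common to the responding arrays and absent from the non-responding arrays is element B preceded by the initiator methionine and followed by element A.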

Example 4 

[0170] A construct similar to pN157 containing the whole epitope array from
pVAX-PC-A was also made and designated pBPL. Thus the encoded amino acid sequence in
pBPL is:

[0171] M-SLLMWITQC-KASEKIFYV-GLPSIPVHPI-GLPSIPVHPI-
KASEKIFYV-SLLMWITQC-KASEKIFYV-KASEKIFYV-
RCGARGPESRLLEFYLAMPFATPMEAELARRSLAQDAPPLPVPGVLLKEFTVSGNILT
IRLTAADHRQLQLSISSCLQQLSLLMWITQCFLPVFLAQPPSGQRR (SEQ ID NO. 20).

[0172] SEQ ID NO. 21 is the polynucleotide encoding SEQ ID NO. 20 used in
pBPL.

[0173] A portion of SEQ ID NO. 20, IKASEKIFYVSLLMWITQCKASEKIFYVK
(SEQ ID NO. 22), was made as a synthetic peptide and subjected to in vitro proteasomal
digestion analysis with human immunoproteasome, utilizing both mass spectrometry and N-
terminal pool sequencing. The identification of a cleavage after the C residue indicates that
this segment of the construct can function as a substrate or liberation sequence for the NY-ESO-
1 157-165 (SEQ ID NO. 12) epitope (see Figure 9). Figure 10 shows the differential processing
of the SLLMWITQC epitope (SEQ ID NO. 12) in its native context, where the cleavage
following the C is more efficiently produced by housekeeping proteasome than by immunoproteasome. The
immunoproteasome also produces a major cleavage internal to the epitope, between the T and
the Q, when the epitope is in its native context, but not in the context of SEQ ID NO. 22
(compare fig. 6 and 7).

Example 5

[0174] Screening of further epitope arrays led to the identification of constructs
promoting the expression of the epitope SSX-2 41-49 (SEQ ID NO. 13). In addition to some of
the array elements defined in Example 3, the following additional elements were also used:

[0175] SSX-4 57-65: VMTKLGFKV (SEQ ID NO. 23) Array element F.

[0176] PSMA 730-739: RQIYVAAFTV (SEQ ID NO. 24) Array element G.

[0177] A construct, denoted CTLA02, encoding an initiator methionine and the
array F-A-G-D-C-F-G-A, was found to successfully immunize HLA-A2 transgenic mice to
generate a CTL response recognizing the peptide SSX-2 41-49 (SEQ ID NO. 13).

[0178] As described above, it can be desirable to combine a sequence with
substrate or liberation sequence function with one that can be processed into immune
epitopes. Thus SSX-2 15-183 (SEQ ID NO. 25) was combined with all or part of the array as
follows:

[0179] CTLS1: F-A-G-D-C-F-G-A-SSX-2 15-183 (SEQ ID NO. 26)

[0180] CTLS2: SSX-2 15-183-F-A-G-D-C-F-G-A (SEQ ID NO. 27)

[0181] CTLS3: F-A-G-D-SSX-2 15-183 (SEQ ID NO. 28)

[0182] CTLS4: SSX-2 15-183-C-F-G-A (SEQ ID NO. 29).

[0183] All of the constructs except CTLS3 were able to induce CTL recognizing
the peptide SSX-2 41-49 (SEQ ID NO. 13). CTLS3 was the only one of these four constructs
which did not include the second element A from CTLA02, suggesting that it was this second
occurrence of the element that provided substrate or liberation sequence function. In CTLS2
and CTLS4 the A element is at the C-terminal end of the array, as in CTLA02. In CTLS1 the
A element is immediately followed by the SSX-2 15-183 segment, which begins with an
alanine, a residue often found after proteasomal cleavage sites (Toes, R.E.M., et al., J. Exp.
Med. 194:1-12, 2001). SEQ ID NO. 30 is the polynucleotide sequence encoding SEQ ID NO.
26 used in CTLS1, also called pCBP.

[0184] A portion of CTLS1 (SEQ ID NO. 26), encompassing array elements G-A-
SSX-2 15-23 with the sequence RQIYVAAFTV-KASEKIFYV-AQIPEKIQK (SEQ ID NO. 31),
was made as a synthetic peptide and subjected to in vitro proteasomal digestion analysis with
human immunoproteasome, utilizing both mass spectrometry and N-terminal pool
sequencing. The observation that the C-terminus of the SSX-2 41-49 epitope (SEQ ID NO. 13)
was generated (see Figure 11) provided further evidence in support of substrate or liberation
sequence function. The data in Figure 12 showed the differential processing of the SSX-2 41-49
epitope, KASEKIFYV (SEQ ID NO. 13), in its native context, where the cleavage
following the V was the predominant cleavage produced by housekeeping proteasome, while
the immunoproteasome had several major cleavage sites elsewhere in the sequence. By
moving this epitope into the context provided by SEQ ID NO. 31 the desired cleavage
became a major one and its relative frequency compared to other immunoproteasome
cleavages was increased (compare Figures 11 and 12). The data in Figure 11B also showed
the similarity in specificity of mouse and human immunoproteasome, lending support to the
usefulness of the transgenic mouse model to predict human antigen processing.

Example 6 

[0185] Screening also revealed substrate or liberation sequence function for a
tyrosinase epitope, Tyr 207-215 (SEQ ID NO. 32), as part of an array consisting of the sequence
[Tyr 1-17 - Tyr 207-215]4, [MLLAVLYCLLWSFQTSA-FLPWHRLFL]4, (SEQ ID NO. 33). The
same vector backbone described above was used to express this array. This array differs from
those of the other examples in that the Tyr 1-17 segment, which was included as a source of
immune epitopes, is used as a repeated element of the array. This is in contrast with the
pattern shown in the other examples, where sequence included as a source of immune
epitopes and/or length occurred a single time at the beginning or end of the array, the
remainder of which was made up of individual epitopes or shorter sequences.

Plasmid construction

[0186] The polynucleotide encoding SEQ ID NO. 33 was generated by assembly
of annealed synthetic oligonucleotides. Four pairs of complementary oligonucleotides were
synthesized which span the entire coding sequence with cohesive ends of the restriction sites
of Afl II and EcoR I at either terminus. Each complementary pair of oligonucleotides was
first annealed, the resultant DNA fragments were ligated stepwise, and the assembled DNA
fragment was inserted into the same vector backbone described above, pre-digested with Afl
II/EcoR I. The construct was called CTLT2/pMEL and SEQ ID NO. 34 is the polynucleotide
sequence used to encode SEQ ID NO. 33.

Example 7 

Administration of a DNA plasmid formulation of an immunotherapeutic for melanoma to
humans.

[0187] An MA2M melanoma vaccine, with a sequence as described in Example 1
above, was formulated in 1% benzyl alcohol, 1% ethyl alcohol, 0.5 mM EDTA, citrate-
phosphate, pH 7.6. Aliquots of 200, 400, and 600 µg DNA/ml were prepared for loading into
MINIMED 407C infusion pumps. The catheter of a SILHOUETTE infusion set was placed
into an inguinal lymph node visualized by ultrasound imaging. The pump and infusion set
assembly was originally designed for the delivery of insulin to diabetics. The usual 17 mm
catheter was substituted with a 31 mm catheter for this application. The infusion set was kept
patent for 4 days (approximately 96 hours) with an infusion rate of about 25 µl/hour, resulting
in a total infused volume of approximately 2.4 ml. Thus the total administered dose per
infusion was approximately 500 and 1000 µg, and can be 1500 µg, respectively, for the three
concentrations described above. Following an infusion, subjects were given a 10 day rest
period before starting a subsequent infusion. Given the continued residency of plasmid DNA
in the lymph node after administration and the usual kinetics of CTL response following
disappearance of antigen, this schedule will be sufficient to maintain the immunologic CTL
response.
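
The dose arithmetic of this example can be reproduced with the short Python sketch below. It assumes concentrations in µg/ml and an infusion rate in µl/hour, which is the reading consistent with the stated total volume of approximately 2.4 ml; the variable names are illustrative.

```python
RATE_UL_PER_HOUR = 25            # infusion rate, assumed to be in microliters per hour
DURATION_HOURS = 96              # 4 days
CONCENTRATIONS_UG_PER_ML = (200, 400, 600)

# Total infused volume in microliters, then the dose for each concentration.
volume_ul = RATE_UL_PER_HOUR * DURATION_HOURS
doses_ug = [conc * volume_ul / 1000 for conc in CONCENTRATIONS_UG_PER_ML]

print(volume_ul / 1000, "ml infused")
for conc, dose in zip(CONCENTRATIONS_UG_PER_ML, doses_ug):
    print(f"{conc} ug/ml -> {dose:.0f} ug per infusion")
```

The computed doses of 480, 960, and 1440 µg correspond to the approximate figures of 500, 1000, and 1500 µg given above.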
Example 8

[0188] SEQ ID NO. 22 is made as a synthetic peptide and packaged with a
cationic lipid protein transfer reagent. The composition is infused directly into the inguinal
lymph node (see Example 7) at a rate of 200 to 600 µg of peptide per day for seven days,
followed by seven days rest. An initial treatment of 3-8 cycles is conducted.

Example 9 

[0189] A fusion protein is made by adding SEQ ID NO. 34 to the 3' end of a
nucleotide sequence encoding herpes simplex virus 1 VP22 (SEQ ID NO. 42) in an
appropriate mammalian expression vector; the vector used above is suitable. The vector is
used to transform HEK 293 cells and 48 to 72 hours later the cells are pelleted and lysed, and a
soluble extract is prepared. The fusion protein is purified by affinity chromatography using an
anti-VP22 monoclonal antibody. The purified fusion protein is administered intranodally at a
rate of 10 to 100 µg per day for seven days, followed by seven days rest. An initial treatment
of 3-8 cycles is conducted.

Examples 10-13 

[0190] The following examples, Examples 10-13, all concern the prediction of 
9-mer epitopes presented by HLA-A2.1, although the procedure is equally applicable to any 
HLA type, or epitope length, for which a predictive algorithm or MHC binding assay is 
available. 
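
As a minimal Python illustration of the candidate peptides considered in these examples, a protein of length L contains L - 8 overlapping 9-mers. The function names `count_kmers` and `kmers` are illustrative assumptions; scoring of each peptide by a predictive algorithm is outside the scope of this sketch.

```python
def count_kmers(protein_length: int, k: int = 9) -> int:
    """Number of overlapping k-mers in a protein of the given length."""
    return max(protein_length - k + 1, 0)


def kmers(sequence: str, k: int = 9) -> list:
    """Return (1-based start position, peptide) for every k-mer in sequence."""
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


# Protein lengths used in Examples 10-13.
for name, length in [("Melan-A", 118), ("SSX-2", 188), ("NY-ESO", 180), ("Tyrosinase", 529)]:
    print(name, count_kmers(length))

# Short demonstration on the first 19 residues of the pMA2M product.
demo = kmers("MLLAVLYCLELAGIGILTV")
print(demo[0], demo[-1])
```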

Example 10 
Melan-A/MART-1 (SEQ ID NO: 2)
[0191] This melanoma tumor-associated antigen (TuAA) is 118 amino acids in
length. Of the 110 possible 9-mers, 16 are given a score ≥16 by the SYFPEITHI/Rammensee
algorithm. (See Table 14). These represent 14.5% of the possible peptides and an average
epitope density on the protein of 0.136 per amino acid. Twelve of these overlap, covering
amino acids 22-49 of SEQ ID NO: 2, resulting in an epitope density for the cluster of 0.428,
giving a ratio, as described above, of 3.15. Another two predicted epitopes overlap at amino
acids 56-69 of SEQ ID NO: 2, giving an epitope density for the cluster of 0.143, which is not
appreciably different from the average, with a ratio of just 1.05. See Figure 1.
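
The density and ratio figures in this example follow from simple arithmetic, sketched below in Python. The function names are illustrative assumptions. Without intermediate rounding the first ratio comes out at about 3.16; the value of 3.15 quoted above results from dividing the rounded densities 0.428 and 0.136.

```python
def epitope_density(n_epitopes: int, first: int, last: int) -> float:
    """Predicted epitopes per amino acid over residues first..last (inclusive)."""
    return n_epitopes / (last - first + 1)


def cluster_ratio(n_cluster: int, first: int, last: int,
                  n_total: int, protein_length: int) -> float:
    """Ratio of the cluster epitope density to the whole-protein density."""
    whole_protein_density = n_total / protein_length
    return epitope_density(n_cluster, first, last) / whole_protein_density


# Melan-A/MART-1: 16 predicted epitopes in 118 amino acids.
ratio_22_49 = cluster_ratio(12, 22, 49, 16, 118)
ratio_56_69 = cluster_ratio(2, 56, 69, 16, 118)

print(round(16 / 118, 3), round(epitope_density(12, 22, 49), 3), round(ratio_22_49, 2))
print(round(epitope_density(2, 56, 69), 3), round(ratio_56_69, 2))
```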



Table 14

SYFPEITHI (Rammensee algorithm) Results for Melan-A/MART-1 (SEQ ID NO: 2)

Rank   Start   Score
1      31      27
2      56      26
3      35      26
4      32      25
5      27      25
6      29      24
7      34      23
8      61      20
9      33      19
10     22      19
11     99      18
12     36      18
13     28      18
14     87      17
15     41      17
16     40      16


[0192] Restricting the analysis to the 9-mers predicted to have a half time of 
dissociation of >5 minutes by the BIMAS-NIH/Parker algorithm leaves only 5. (See Table 
15). The average density of epitopes in the protein is now only 0.042 per amino acid. Three 
overlapping peptides cover amino acids 31-48 of SEQ ID NO: 2 and the other two cover 56- 
69 of SEQ ID NO: 2, as before, giving ratios of 3.93 and 3.40, respectively. (See Table 16). 






Table 15

BIMAS-NIH/Parker algorithm Results for Melan-A/MART-1 (SEQ ID NO: 2)

Rank   Start   Score      Log(Score)
1      40      1289.01    3.11
2      56      1055.104   3.02
3      31      81.385     1.91
4      35      20.753     1.32
5      61      4.968      0.70


Table 16

Predicted Epitope Clusters for Melan-A/MART-1 (SEQ ID NO: 2)

                                 Calculations (Epitopes/AAs)
Cluster   AA       Peptides     Cluster   Whole protein   Ratio
1         31-48    3,4,1        0.17      0.042           3.93
2         56-69    2,5          0.14      0.042           3.40



Example 11
SSX-2/HOM-MEL-40 (SEQ ID NO: 40)

[0193] This melanoma tumor-associated antigen (TuAA) is 188 amino acids in
length. Of the 180 possible 9-mers, 11 are given a score ≥16 by the SYFPEITHI/Rammensee
algorithm. These represent 6.1% of the possible peptides and an average epitope density on
the protein of 0.059 per amino acid. Three of these overlap, covering amino acids 99-114 of
SEQ ID NO: 40, resulting in an epitope density for the cluster of 0.188, giving a ratio, as
described above, of 3.18. There are also overlapping pairs of predicted epitopes at amino
acids 16-28, 57-67, and 167-183 of SEQ ID NO: 40, giving ratios of 2.63, 3.11, and 2.01,
respectively. There is an additional predicted epitope covering amino acids 5-28. Evaluating
the region 5-28 of SEQ ID NO: 40, containing three epitopes, gives an epitope density of
0.125 and a ratio of 2.14.

[0194] Restricting the analysis to the 9-mers predicted to have a half time of
dissociation of >5 minutes by the BIMAS-NIH/Parker algorithm leaves only 6. The average
density of epitopes in the protein is now only 0.032 per amino acid. Only a single pair
overlaps, at 167-180 of SEQ ID NO: 40, with a ratio of 4.48. However, the top ranked peptide
is close to another single predicted epitope; if that region, amino acids 41-65 of SEQ ID NO:
40, is evaluated, the ratio is 2.51, representing a substantial difference from the average. See
Figure 2.



Table 17

SYFPEITHI/Rammensee algorithm for SSX-2/HOM-MEL-40 (SEQ ID NO: 40)

Rank   Start   Score
1      103     23
2      167     22
3      41      22
4      16      21
5      99      20
6      59      19
7      20      17
8      5       17
9      175     16
10     106     16
11     57      16



Table 18

Calculations (Epitopes/AAs) (SEQ ID NO: 40)

                                  Calculations (Epitopes/AAs)
Cluster   AA        Peptides     Cluster   Whole protein   Ratio
1         5 to 28   8,4,7        0.125     0.059           2.14
2         16-28     4,7          0.15      0.059           2.63
3         57-67     11,6         0.18      0.059           3.11
4         99-114    5,1,10       0.19      0.059           3.20
5         167-183   2,9          0.12      0.059           2.01



Table 19

BIMAS-NIH/Parker algorithm (SEQ ID NO: 40)

Rank   Start   Score      Log(Score)
1      41      1017.062   3.01
2      167     21.672     1.34
3      57      20.81      1.32
4      103     10.433     1.02
5      172     10.068     1.00
6      16      6.442      0.81



Table 20

Calculations (Epitopes/AAs) (SEQ ID NO: 40)

Cluster   AA        Peptides   Cluster   Whole protein   Ratio
1         41-65     1,3        0.08      0.032           2.51
2         167-180   2,5        0.14      0.032           4.48



Example 12
NY-ESO (SEQ ID NO: 11)

[0195] This tumor-associated antigen (TuAA) is 180 amino acids in length. Of
the 172 possible 9-mers, 25 are given a score ≥16 by the SYFPEITHI/Rammensee algorithm.
Like Melan-A above, these represent 14.5% of the possible peptides and an average epitope
density on the protein of 0.136 per amino acid. However, the distribution is quite different.
Nearly half the protein is empty, with just one predicted epitope in the first 78 amino acids.
Unlike Melan-A, where there was a very tight cluster of highly overlapping peptides, in NY-
ESO the overlaps are smaller and extend over most of the rest of the protein. One set of 19
overlapping peptides covers amino acids 108-174 of SEQ ID NO: 11, resulting in a ratio of
2.04. Another 5 predicted epitopes cover 79-104 of SEQ ID NO: 11, for a ratio of just 1.38.

[0196] If instead one takes the approach of considering only the top 5% of
predicted epitopes, in this case 9 peptides, one can examine whether good clusters are being
obscured by peptides predicted to be less likely to bind to MHC. When just these predicted
epitopes are considered we see that the region 108-140 of SEQ ID NO: 11 contains 6
overlapping peptides with a ratio of 3.64. There are also 2 nearby peptides in the region 148-
167 of SEQ ID NO: 11 with a ratio of 2.00. Thus the large cluster 108-174 of SEQ ID NO:
11 can be broken into two smaller clusters covering much of the same sequence.

[0197] Restricting the analysis to the 9-mers predicted to have a half time of
dissociation of >5 minutes by the BIMAS-NIH/Parker algorithm brings 14 peptides into
consideration. The average density of epitopes in the protein is now 0.078 per amino acid. A
single set of 10 overlapping peptides is observed, covering amino acids 144-171 of SEQ ID
NO: 11, with a ratio of 4.59. All 14 peptides fall in the region 86-171 of SEQ ID NO: 11,
which is still 2.09 times the average density of epitopes in the protein. While such a
cluster is larger than we consider ideal, it still offers a significant advantage over working
with the whole protein. See Figure 3.
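
The grouping of predicted epitopes into overlapping sets can be sketched in Python as follows, using the SYFPEITHI start positions listed for NY-ESO in Table 21. Two 9-mers are treated here as overlapping when they share at least one residue; this rule, and the names `cluster_overlapping` and `NY_ESO_STARTS`, are assumptions made for illustration rather than a procedure stated in the text.

```python
def cluster_overlapping(starts, k=9):
    """Group k-mers (given by 1-based start positions) into runs of overlapping peptides.

    Returns a list of (first residue, last residue, number of peptides).
    """
    clusters = []
    for start in sorted(starts):
        end = start + k - 1
        if clusters and start <= clusters[-1][1]:
            first, _, count = clusters[-1]
            clusters[-1] = (first, end, count + 1)   # extend the current run
        else:
            clusters.append((start, end, 1))         # begin a new run
    return clusters


# Start positions of the 25 predicted epitopes, in rank order.
NY_ESO_STARTS = [108, 148, 159, 127, 86, 132, 122, 120, 115, 96, 113, 91, 166,
                 161, 157, 151, 137, 79, 139, 131, 87, 152, 144, 129, 15]
PROTEIN_LENGTH = 180

clusters = cluster_overlapping(NY_ESO_STARTS)
whole_protein_density = len(NY_ESO_STARTS) / PROTEIN_LENGTH
ratios = {
    (first, last): (count / (last - first + 1)) / whole_protein_density
    for first, last, count in clusters
    if count > 1
}

for (first, last), ratio in ratios.items():
    print(f"{first}-{last}: ratio {ratio:.2f}")
```

Under this overlap rule the sketch recovers the 5-peptide group at 79-104 and the 19-peptide group at 108-174 described in this example.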






Table 21

SYFPEITHI (Rammensee algorithm) Results for NY-ESO (SEQ ID NO: 11)

Rank   Start   Score
1      108     25
2      148     24
3      159     21
4      127     21
5      86      21
6      132     20
7      122     20
8      120     20
9      115     20
10     96      20
11     113     19
12     91      19
13     166     18
14     161     18
15     157     18
16     151     18
17     137     18
18     79      18
19     139     17
20     131     17
21     87      17
22     152     16
23     144     16
24     129     16
25     15      16
Table 22

Calculations (Epitopes/AAs) (SEQ ID NO: 11)

Cluster   AA        Peptides                               Cluster   Whole protein   Ratio
1         108-140   1,9,8,7,4,6                            0.18      0.05            3.64
2         148-167   2,3                                    0.10      0.05            2.00
3         79-104    5,12,10,18,21                          0.19      0.14            1.38
4         108-174   1,11,9,8,7,4,6,17,2,16,15,3,           0.28      0.14            2.04
                    14,13,24,20,19,23,22







Table 23

BIMAS-NIH/Parker algorithm Results for NY-ESO (SEQ ID NO: 11)

Rank   Start   Score      Log(Score)
1      159     1197.321   3.08
2      86      429.578    2.63
3      120     130.601    2.12
4      161     83.584     1.92
5      155     52.704     1.72
6      154     49.509     1.69
7      157     42.278     1.63
8      108     21.362     1.33
9      132     19.425     1.29
10     145     13.624     1.13
11     163     11.913     1.08
12     144     11.426     1.06
13     148     6.756      0.83
14     152     4.968      0.70



Table 24

Calculations (Epitopes/AAs) (SEQ ID NO: 11)

Cluster   AA        Peptides                          Cluster   Whole protein   Ratio
1         86-171    2,8,3,9,10,12,13,14,6,5,7,1,      0.163     0.078           2.09
                    4,11
2         144-171   10,12,13,14,6,5,7,1,4,11          0.36      0.078           4.59



Example 13
Tyrosinase (SEQ ID NO: 3)
[0198] This melanoma tumor-associated antigen (TuAA) is 529 amino acids in
length. Of the 521 possible 9-mers, 52 are given a score ≥16 by the SYFPEITHI/Rammensee
algorithm. These represent 10% of the possible peptides and an average epitope density on
the protein of 0.098 per amino acid. There are 5 groups of overlapping peptides containing 2
to 13 predicted epitopes each, with ratios ranging from 2.03 to 4.41, respectively. There are
an additional 7 groups of overlapping peptides, containing 2 to 4 predicted epitopes each,
with ratios ranging from 1.20 to 1.85, respectively. The 17 peptides in the region 444-506 of
SEQ ID NO: 3, including the 13 overlapping peptides above, constitute a cluster with a ratio
of 2.20.

[0199] Restricting the analysis to the 9-mers predicted to have a half time of
dissociation of >5 minutes by the BIMAS-NIH/Parker algorithm brings 28 peptides into
consideration. The average density of epitopes in the protein under this condition is 0.053
per amino acid. At this density any overlap represents more than twice the average density of
epitopes. There are 5 groups of overlapping peptides containing 2 to 7 predicted epitopes
each, with ratios ranging from 2.22 to 4.9, respectively. Only three of these clusters are
common to the two algorithms. Several, but not all, of these clusters could be enlarged by
evaluating a region containing them and nearby predicted epitopes.
Table 25

SYFPEITHI/Rammensee algorithm Results for Tyrosinase (SEQ ID NO: 3)

Rank   Start   Score
1      490     34
2      491     31
3      487     28
4      1       27
5      2       25
6      482     23
7      380     23
8      369     23
9      214     23
10     506     22
11     343     22
12     207     22
13     137     22
14     57      22
15     169     20
16     118     20
17     9       20
18     488     19
19     483     19
20     480     19
21     479     19
22     478     19
23     473     19
24     365     19
25     287     19
26     200     19
27     5       19
28     484     18
29     476     18
30     463     18
31     444     18
32     425     18
33     316     18
34     187     18
35     402     17
36     388     17
37     346     17
38     336     17
39     225     17
40     224     17
41     208     17
42     186     17
43     171     17
44     514     16
45     494     16
46     406     16
47     385     16
48     349     16
49     184     16
50     167     16
51     145     16
52     139     16



Table 26

Calculations (Epitopes/AAs) (SEQ ID NO: 3)

Cluster   AA        Peptides                              Cluster   Whole protein   Ratio
1         1 to 17   4,5,27,17                             0.24      0.098           2.39
2         137-153   13,52,51                              0.18      0.098           1.80
3         167-179   15,43,50                              0.23      0.098           2.35
4         184-195   34,42,49                              0.25      0.098           2.54
5         200-222   26,41,9,12                            0.17      0.098           1.77
6         224-233   39,40                                 0.20      0.098           2.03
7         336-357   38,11,37,48                           0.18      0.098           1.85
8         365-377   24,8                                  0.15      0.098           1.57
9         380-396   7,47,36                               0.18      0.098           1.80
10        402-414   35,46                                 0.15      0.098           1.57
11        473-502   29,28,23,22,21,20,6,19,3,18,          0.43      0.098           4.41
                    1,2,45
12        506-522   10,44                                 0.12      0.098           1.20
          444-522   31,30,23,29,22,21,20,6,19,            0.22      0.098           2.20
                    28,3,18,1,2,45,10,44
Table 27

BIMAS-NIH/Parker algorithm Results (SEQ ID NO: 3)

Rank   Start   Score     Log(Score)
1      207     540.469   2.73
2      369     531.455   2.73
3      1       309.05    2.49
4      9       266.374   2.43
5      490     181.794   2.26
6      214     177.566   2.25
7      224     143.451   2.16
8      171     93.656    1.97
9      506     87.586    1.94
10     487     83.527    1.92
11     491     83.527    1.92
12     2       54.474    1.74
13     137     47.991    1.68
14     200     30.777    1.49
15     208     26.248    1.42
16     460     21.919    1.34
17     478     19.425    1.29
18     365     17.14     1.23
19     380     16.228    1.21
20     444     13.218    1.12
21     473     13.04     1.12
22     57      10.868    1.04
23     482     8.252     0.92
24     483     7.309     0.86
25     5       6.993     0.84
26     225     5.858     0.77
27     343     5.195     0.72
28     514     5.179     0.71
Table 28

Calculations (Epitopes/AAs) (SEQ ID NO: 3)

Cluster   AA        Peptides                            Cluster   Whole protein   Ratio
1         1 to 17   3,12,25,4                           0.24      0.053           4.45
2         200-222   14,1,15,6                           0.17      0.053           3.29
3         224-233   7,26                                0.20      0.053           3.78
4         365-377   18,2                                0.15      0.053           2.91
5         473-499   21,17,23,24,10,5,11                 0.26      0.053           4.90
6         506-522   9,28                                0.12      0.053           2.22
7         365-388   18,2,19                             0.13      0.053           2.36
8         444-499   20,16,21,17,23,24,10,5,11           0.16      0.053           3.03
9         444-522   20,16,21,17,23,24,10,5,11,9,28      0.14      0.053           2.63
10        200-233   14,1,15,6,7,26                      0.18      0.053           3.33



[0200] All references mentioned herein are hereby incorporated by reference in
their entirety. Further, the present invention can utilize various aspects of the following,
which are all incorporated by reference in their entirety: U.S. Patent Application Nos.
09/380,534, filed on September 1, 1999, entitled A METHOD OF INDUCING A CTL
RESPONSE; 09/776,232, filed on February 2, 2001, entitled METHOD OF INDUCING A
CTL RESPONSE; 09/715,835, filed on November 16, 2000, entitled AVOIDANCE OF
UNDESIRABLE REPLICATION INTERMEDIATES IN PLASMID PROPAGATION;
09/999,186, filed on November 7, 2001, entitled METHODS OF COMMERCIALIZING AN
ANTIGEN; and Provisional U.S. Patent Application No. 60/274,063, filed on March 7, 2001,
entitled ANTI-NEOVASCULAR VACCINES FOR CANCER.
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Table 29 
Partial listing of SEP ID NOS. 



1 


ELAGIGILTV 


melan- A 26-35 (A27L) | 


2 


Melan -A protein 


Accession number: 
NP 005502 


3 


Tyrosinase protein 


Accession number: 
P14679 


4 


MLLAVLYCLELAGIGILTVYM 

DGTMSQVGILTVILGVLLLIGC 

WYCRRRNGYRALMDKSLHVG 

TQCALTRRCPQEGFDHRDSKV 

SLQEKNCEPV 


pMA2M expression 
product 


5 


MLLAVLYCLELAGIGILTVYM 
DGTMSQV 


Liberation or substrate 
sequence for SEQ ID ! 
NO. 1 

from pMA2M 


6 


MLLAVLYCL 


tyrosinase 1-9 


7 


YMDGTMSQV 


tyrosinase 369-377 


8 


EAAGIGILTV 


melan- A 26-35 


9 


cttaagccaccatgttactagctgttttgtactgcctggaac 

tagcagggatcggcatattgacagtgtatatgga 

tggaacaatgtcccaggtaggaattctgacagtgatcctggg 

agtcttactgctcatcggctgttggtattgtaga 

agacgaaatggatacagagccttgatggataaaagtcttcat 

gttggcactcaatgtgccttaacaagaagatgcc 

cacaagaagggtttgatcatcgggacagcaaagtgtctcttc 

aagagaaaaac tgtgaacctgtgtagtgagcggc 

cgc 


pMA2M insert 


10 


MVLYCLELAGIGILTVYMDGT 
AVLYCLELAGIGILTVYMDGT 
MLA VL YCLELAGIG1LT V YMD 
GTMSLLAVLYCLELAGIGILTV 


Epitope array from 
pVAXM2 and 
pVAXMl 


11 


NY-ESO-1 protein 


Accession number: 
P78358 


12 


SLLMWITQC 


NY-ESO-1 157-165 


13 


KASEKIFYV 


SSX-2 41-49 


14 


TQCFLPVFL 


NY-ESO-1 163-171 
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15 


GLPSIPVHPI 


PSMA 288-297 


16 


AVLYCL 


tyrosinase 4-9 


17 


MSLLMWITQCKASEKIFYVRCGARGPES 
RLLEF YLAMPF ATPME AEL ARRS L AQD A 
PPLPVPGVLLKEFTVSGNILTIRLTAADHR 
QLQLSISSCLQQLSLLMWITQCFLPVFLAQ 
PPSGQRR 


pN157 expression 
product 


18 


MSLLMWITQCKASEKIFYV 


liberation or substrate 
sequence for SEQ ID 
NO. 12frompN157 


19 


cttaagccaccatgtccctgttgatgtggatcacgcagtgca 

aagcttcggagaaaatcttctacgtacggtgcgg 

tgccagggggccggagagccgcctgcttgagttctacctcgc 

catgcctttcgcgacacccatggaagcagagctg 

gcccgcaggagcctggcccaggatgccccaccgcttcccgtg 

ccaggggtgcttctgaaggagttcactgtgtccg 

gcaacatactgactatccgactgactgctgcagaccaccgcc 

aactgcagctctccatcagctcctgtctccagca 

gctttccctgttgatgtggatcacgcagtgctttctgcccgt 

gtttttggctcagcctccctcagggcagaggcgc 

tagtgagaattc 


Insert for pN157 


20 


MSLLMWITQCKASEKIFYVGLPSIPVHPIG 

LPSIPVHPIKASEKIFYVSLLMWITQCKAS 

EKIFYVKASEKIFYVRCGARGPESRLLEFY 

LAMPFATPMEAELARRSLAQDAPPLPVP 

GVLLKEFTVSGNILTIRLTAADHRQLQLSI 

SSCLQQLSLLMWITQCFLPVFLAQPPSGQ 

RR 


pBPL expression 
product 


21 


atgtccctgttgatgtggatcacgcagtgcaaagcttcggag 

aaaatcttctatgtgggtcttccaagtattcctg 

ttcatccaattggtcttccaagtattcctgttcatccaatta 

aagcttcggagaaaatcttctatgtgtccctgtt 

gatgtggatcacgcagtgcaaagcttcggagaaaatcttcta 

tgtgaaagcttcggagaaaatcttctacgtacgg 

tgcggtgccagggggccggagagccgcctgcttgagttctac 

ctcgccatgcctttcgcgacacccatggaagcag 

agctggcccgcaggagcctggcccaggatgccccaccgcttc 

ccgtgccaggggtgcttctgaaggagttcactgt 

gtccggcaacatactgactatccgactgactgctgcagacca 

ccgccaactgcagctctccatcagctcctgtctc 

cagcagctttccctgttgatgtggatcacgcagtgctttctg 

cccgtgtttttggctcagcctccctcagggcaga 

ggcgctagtga 


pBPL insert coding 
region 


22 


IKASEKIFYVSLLMWITQCKASEKIFYVK 


Substrate in Fig. 9 


23 


VMTKLGFKV 


SSX-4 57-65 


24 


RQIYVAAFTV 


PSMA 730-739 


25 


AQIPEKIQKAFDDIAKYFSKEEWEKMKAS 

EKIFYVYMKRKYEAMTKLGFKATLPPFM 

CNKRAEDFQGNDLDNDPNRGNQVERPQ 

MTFGRLQGISPKIMPKKPAEEGNDSEEVP 

EASGPQNDGKELCPPGKPTTSEKIHERSG 

PKRGEHAWTHRLRERKQLVIYEEISDP 


SSX-2 15-183 




26 


MVMTKLGFKVKASEKIFYVRQIYVAAFT 
V 

GLPSIPVHPITQCFLPVFLVMTKLGFKVRQ 

IYVAAFTVKASEKIFYVAQIPEKIQKAFDD 

IAKYFSKEEWEKMKASEKIFYVYMKRKY 

EAMTKLGFKATLPPFMCNKRAEDFQGND 

LDNDPNRGNQVERPQMTFGRLQGISPKI 

MPKKPAEEGNDSEEVPEASGPQNDGKEL 

CPPGKPTTSEKIHERSGPKRGEHAWTHRL 

RERKQLVIYEEISDP 


CTLS1/pCBP 
expression product 


27 


MAQIPEKIQKAFDDIAKYFSKEEWEKMK 

ASEKIFYVYMKRKYEAMTKLGFKATLPP 

FMCNKRAEDFQGNDLDNDPNRGNQVER 

PQMTFGRLQGISPKIMPKKPAEEGNDSEE 

VPEASGPQNDGKELCPPGKPTTSEKIHER 

SGPKRGEHAWTHRLRERKQLVIYEEISDP 

VMTKLGFKVKASEKIFYVRQIYVAAFTV 

GLPSIPVHPITQCFLPVFLVMTKLGFKVRQ 

IYVAAFTVKASEKIFYV 


CTLS2 expression 
product 


28 


MVMTKLGFKVKASEKIFYVRQIYVAAFT 
V 

GLPSIPVHPIAQIPEKIQKAFDDIAKYFSKE 

EWEKMKASEKIFYVYMKRKYEAMTKLG 

FKATLPPFMCNKRAEDFQGNDLDNDPNR 

GNQVERPQMTFGRLQGISPKIMPKKPAEE 

GNDSEEVPEASGPQNDGKELCPPGKPTTS 

EKIHERSGPKRGEHAWTHRLRERKQLVIY 

EEISDP 


CTLS3 expression 
product 


29 


MAQIPEKIQKAFDDIAKYFSKEEWEKMK 

ASEKIFYVYMKRKYEAMTKLGFKATLPP 

FMCNKRAEDFQGNDLDNDPNRGNQVER 

PQMTFGRLQGISPKIMPKKPAEEGNDSEE 

VPEASGPQNDGKELCPPGKPTTSEKIHER 

SGPKRGEHAWTHRLRERKQLVIYEEISDP 

TQCFLPVFLVMTKLGFKVRQIYVAAFTV 

KASEKIFYV 


CTLS4 expression 
product 


30 


atggtcatgactaaactaggtttcaaggtcaaagcttcggag 

aaaatcttctatgtgagacagatttatgttgcag 

ccttcacagtgggtcttccaagtattcctgttcatccaatta 

cgcagtgctttctgcccgtgtttttggtcatgac 

taaactaggtttcaaggtcagacagatttatgttgcagcctt 

cacagtgaaagcttcggagaaaatcttctacgta 

gctcaaataccagagaagatccaaaaggccttcgatgatatt 

gccaaatacttctctaaggaagagtgggaaaaga 

tgaaagcctcggagaaaatcttctatgtgtatatgaagagaa 

agtatgaggctatgactaaactaggtttcaaggc 

caccctcccacctttcatgtgtaataaacgggccgaagactt 

ccaggggaatgatttggataatgaccctaaccgt 

gggaatcaggttgaacgtcctcagatgactttcggcaggctc 

cagggaatctccccgaagatcatgcccaagaagc 

cagcagaggaaggaaatgattcggaggaagtgccagaagcat 

ctggcccacaaaatgatgggaaagagctgtgccc 

cccgggaaaaccaactacctctgagaagattcacgagagatc 

tggacccaaaaggggggaacatgcctggacccac 

agactgcgtgagagaaaacagctggtgatttatgaagagatc 

agcgacccttagtga 


pCBP insert coding 
region 




31 


RQIYVAAFTVKASEKIFYVAQIPEKIQK 


Fig. 11 substrate/ 
CTLS1-2 


32 


FLPWHRLFL 


TYR 207-215 


33 


MLLAVLYCLLWSFQTSAFLPWHRLFLML 
LAVLYCLLWSFQTSAFLPWHRLFLMLLA 
VLYCLLWSFQTSAFLPWHRLFLMLLAVL 
YCLLWSFQTSAFLPWHRLFL 


CTLT2/pMEL 
expression product 


34 


atgctcctggctgttttgtactgcctgctgtggagtttccag 

acctccgcttttctgccttggcatagactcttct 

tgatgctcctggctgttttgtactgcctgctgtggagtttcc 

agacctccgcttttctgccttggcatagactctt 

cttgatgctcctggctgttttgtactgcctgctgtggagttt 

ccagacctccgcttttctgccttggcatagactc 

ttcttgatgctcctggctgttttgtactgcctgctgtggagt 

ttccagacctccgcttttctgccttggcatagac 

tcttcttgtagtga 


CTLT2/pMEL insert 
coding region 








35 


MELAN-A cDNA 


Accession number: 
NM_005511 


36 


Tyrosinase cDNA 


Accession number: 
NM_000372 


37 


NY-ESO-1 cDNA 


Accession number: 
U87459 


38 


PSMA protein 


Accession number: 
NP_004467 


39 


PSMA cDNA 


Accession number: 
NM_004476 


40 


SSX-2 protein 


Accession number: 
NP_003138 


41 


SSX-2 cDNA 


Accession number: 
NM_003147 


42 


atgacctctcgccgctccgtgaagtcgggtccgcgggaggttcc 

gcgcgatgagtacgaggatctgtactacaccccgtcttcaggtat 

ggcgagtcccgatagtccgcctgacacctcccgccgtggcgcc 

ctacagacacgctcgcgccagaggggcgaggtccgtttcgtcca 

gtacgacgagtcggattatgccctctacgggggctcgtcatccga 

agacgacgaacacccggaggtcccccggacgcggcgtcccgt 

ttccggggcggttttgtccggcccggggcctgcgcgggcgcctc 

cgccacccgctgggtccggaggggccggacgcacacccacca 

ccgccccccgggccccccgaacccagcgggtggcgactaagg 

cccccgcggccccggcggcggagaccacccgcggcaggaaa 

tcggcccagccagaatccgccgcactcccagacgcccccgcgt 

cgacggcgccaacccgatccaagacacccgcgcaggggctgg 

ccagaaagctgcactttagcaccgcccccccaaaccccgacgc 

gccatggaccccccgggtggccggctttaacaagcgcgtcttct 

gcgccgcggtcgggcgcctggcggccatgcatgcccggatgg 

cggcggtccagctctgggacatgtcgcgtccgcgcacagacga 

agacctcaacgaactccttggcatcaccaccatccgcgtgacgg 

tctgcgagggcaaaaacctgcttcagcgcgccaacgagttggtg 

aatccagacgtggtgcaggacgtcgacgcggccacggcgactc 

gagggcgttctgcggcgtcgcgccccaccgagcgacctcgagc 

cccagcccgctccgcttctcgccccagacggcccgtcgag 


From accession number: 
D10879 

Herpes Simplex virus 1 
UL49 coding sequence 
(VP22) 


43 


MTSRRSVKSGPREVPRDEYEDLYYTPSSG 

MASPDSPPDTSRRGALFTQTRSRQRGEVR 

FVQYDESDYALYGGSSSEDDEHPEVPRT 

RRPVSGAVLSGPGPARAPPPFTPAGSGGA 

GRTPTTAPRAPRTQRVATKAPAAPAAET 

TRGRKSAQPESAALPDAPASTAPTFTRSK 

TPAQGLARKLHFSTAPPNPDAPWTPRVA 

GFNKRVFCAAVGRLAAMHARMAAVQL 

WDFTMSRPRTDEDLNELLGITTIRVTVCE 

GKNLLQRANELVNPDVVQDVDAATATR 

GRSAASRFTPTERPRAPARSASRPRRPVE 


Accession number: 
P10233 

Herpes Simplex virus 1 
UL49/VP22 protein 

Melan-A mRNA sequence 

LOCUS NM_005511 1524 bp mRNA PRI 14-OCT-2001 
DEFINITION Homo sapiens melan-A (MLANA), mRNA. 
ACCESSION NM_005511 
VERSION NM_005511.1 GI:5031912 

(SEQ ID NO. 2) 

/translation="MPREDAHFIYGYPKKGHGHSYTTAEEAAGIGILTVILGVLLLIGC 

WYCRRRNGYRALMDKSLHVGTQCALTRRCPQEGFDHRDSKVSLQEKNCEPVVPNAPPA 

YEKLSAEQSPPPYSP" 

(SEQ ID NO. 35) 
ORIGIN 

1 agcagacaga ggactctcat taaggaaggt gtcctgtgcc ctgaccctac aagatgccaa 
61 gagaagatgc tcacttcatc tatggttacc ccaagaaggg gcacggccac tcttacacca 
121 cggctgaaga ggccgctggg atcggcatcc tgacagtgat cctgggagtc ttactgctca 
181 tcggctgttg gtattgtaga agacgaaatg gatacagagc cttgatggat aaaagtcttc 
241 atgttggcac tcaatgtgcc ttaacaagaa gatgcccaca agaagggttt gatcatcggg 
301 acagcaaagt gtctcttcaa gagaaaaact gtgaacctgt ggttcccaat gctccacctg 
361 cttatgagaa actctctgca gaacagtcac caccacctta ttcaccttaa gagccagcga 
421 gacacctgag acatgctgaa attatttctc tcacactttt gcttgaattt aatacagaca 
481 tctaatgttc tcctttggaa tggtgtagga aaaatgcaag ccatctctaa taataagtca 
541 gtgttaaaat tttagtaggt ccgctagcag tactaatcat gtgaggaaat gatgagaaat 
601 attaaattgg gaaaactcca tcaataaatg ttgcaatgca tgatactatc tgtgccagag 
661 gtaatgttag taaatccatg gtgttatttt ctgagagaca gaattcaagt gggtattctg 
721 gggccatcca atttctcttt acttgaaatt tggctaataa caaactagtc aggttttcga 
781 accttgaccg acatgaactg tacacagaat tgttccagta ctatggagtg ctcacaaagg 
841 atacttttac aggttaagac aaagggttga ctggcctatt tatctgatca agaacatgtc 
901 agcaatgtct ctttgtgctc taaaattcta ttatactaca ataatatatt gtaaagatcc 
961 tatagctctt tttttttgag atggagtttc gcttttgttg cccaggctgg agtgcaatgg 
1021 cgcgatcttg gctcaccata acctccgcct cccaggttca agcaattctc ctgccttagc 
1081 ctcctgagta gctgggatta caggcgtgcg ccactatgcc tgactaattt tgtagtttta 
1141 gtagagacgg ggtttctcca tgttggtcag gctggtctca aactcctgac ctcaggtgat 
1201 ctgcccgcct cagcctccca aagtgctgga attacaggcg tgagccacca cgcctggctg 
1261 gatcctatat cttaggtaag acatataacg cagtctaatt acatttcact tcaaggctca 
1321 atgctattct aactaatgac aagtattttc tactaaacca gaaattggta gaaggattta 
1381 aataagtaaa agctactatg tactgcctta gtgctgatgc ctgtgtactg ccttaaatgt 
1441 acctatggca atttagctct cttgggttcc caaatccctc tcacaagaat gtgcagaaga 
1501 aatcataaag gatcagagat tctg 
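
The relationship between the mRNA record above and its /translation field can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code and assuming that the coding region begins at the first ATG in the record; the function name `translate_from_first_atg` and the constant `MELAN_A_5PRIME` are illustrative and are not part of the disclosure. Only the first two lines of the ORIGIN block (nucleotides 1-120) are used.

```python
# Standard genetic code, with codons enumerated in the base order t, c, a, g.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(mrna: str) -> str:
    """Translate from the first ATG until a stop codon or the end of the input.

    Whitespace is ignored, so text copied from an ORIGIN block can be passed
    directly once the position numbers are removed. Assumption: the first ATG
    is the true start codon, which need not hold for every transcript.
    """
    seq = "".join(mrna.lower().split())
    start = seq.find("atg")
    if start < 0:
        return ""
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Nucleotides 1-120 of the Melan-A record (position numbers removed).
MELAN_A_5PRIME = """
agcagacaga ggactctcat taaggaaggt gtcctgtgcc ctgaccctac aagatgccaa
gagaagatgc tcacttcatc tatggttacc ccaagaaggg gcacggccac tcttacacca
"""

print(translate_from_first_atg(MELAN_A_5PRIME))  # MPREDAHFIYGYPKKGHGHSYT
```

The output corresponds to the first 22 residues of the Melan-A translation. Passing the full 1524-nucleotide sequence would be expected to yield the complete protein, ending at the stop codon.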



Tyrosinase mRNA sequence 



LOCUS NM_000372 1964 bp mRNA PRI 31-OCT-2000 
DEFINITION Homo sapiens tyrosinase (oculocutaneous albinism IA) (TYR), mRNA. 
ACCESSION NM_000372 
VERSION NM_000372.1 GI:4507752 



(SEQ ID NO. 3) 

/translation="MLLAVLYCLLWSFQTSAGHFPRACVSSKNLMEKECCPPWSGDRS 

PCGQLSGRGSCQNILLSNAPLGPQFPFTGVDDRESWPSVFYNRTCQCSGNFMGFNCG 

NCKFGFWGPNCTERRLLVRRNIFDLSAPEKDKFFAYLTLAKHTISSDYVIPIGTYGQM 

KNGSTPMFNDINIYDLFVWMHYYVSMDALLGGSEIWRDIDFAHEAPAFLPWHRLFLL 

RWEQEIQKLTGDENFTIPYWDWRDAEKCDICTDEYMGGQHPTNPNLLSPASFFSSW 

QIVCSRLEEYNSHQSLCNGTPEGPLRRNPGNHDKSRTPRLPSSADVEFCLSLTQYESG 

SMDKAANFSFRNTLEGFASPLTGIADASQSSMHNALHIYMNGTMSQVQGSANDPIFL 

LHHAFVDSIFEQWLRRHRPLQEVYPEANAPIGHNRESYMVPFIPLYRNGDFFISSKDL 

GYDYSYLQDSDPDSFQDYIKSYLEQASRIWSWLLGAAMVGAVLTALLAGLVSLLCR 

HKRKQLPEEKQPLLMEKEDYHSLYQSHL" 

(SEQ ID NO. 36) 
ORIGIN 

1 atcactgtag tagtagctgg aaagagaaat ctgtgactcc aattagccag ttcctgcaga 
61 ccttgtgagg actagaggaa gaatgctcct ggctgttttg tactgcctgc tgtggagttt 
121 ccagacctcc gctggccatt tccctagagc ctgtgtctcc tctaagaacc tgatggagaa 
181 ggaatgctgt ccaccgtgga gcggggacag gagtccctgt ggccagcttt caggcagagg 
241 ttcctgtcag aatatccttc tgtccaatgc accacttggg cctcaatttc ccttcacagg 
301 ggtggatgac cgggagtcgt ggccttccgt cttttataat aggacctgcc agtgctctgg 
361 caacttcatg ggattcaact gtggaaactg caagtttggc ttttggggac caaactgcac 
421 agagagacga ctcttggtga gaagaaacat cttcgatttg agtgccccag agaaggacaa 
481 attttttgcc tacctcactt tagcaaagca taccatcagc tcagactatg tcatccccat 
541 agggacctat ggccaaatga aaaatggatc aacacccatg tttaacgaca tcaatattta 
601 tgacctcttt gtctggatgc attattatgt gtcaatggat gcactgcttg ggggatctga 
661 aatctggaga gacattgatt ttgcccatga agcaccagct tttctgcctt ggcatagact 
721 cttcttgttg cggtgggaac aagaaatcca gaagctgaca ggagatgaaa acttcactat 
781 tccatattgg gactggcggg atgcagaaaa gtgtgacatt tgcacagatg agtacatggg 
841 aggtcagcac cccacaaatc ctaacttact cagcccagca tcattcttct cctcttggca 
901 gattgtctgt agccgattgg aggagtacaa cagccatcag tctttatgca atggaacgcc 
961 cgagggacct ttacggcgta atcctggaaa ccatgacaaa tccagaaccc caaggctccc 
1021 ctcttcagct gatgtagaat tttgcctgag tttgacccaa tatgaatctg gttccatgga 
1081 taaagctgcc aatttcagct ttagaaatac actggaagga tttgctagtc cacttactgg 
1141 gatagcggat gcctctcaaa gcagcatgca caatgccttg cacatctata tgaatggaac 
1201 aatgtcccag gtacagggat ctgccaacga tcctatcttc cttcttcacc atgcatttgt 
1261 tgacagtatt tttgagcagt ggctccgaag gcaccgtcct cttcaagaag tttatccaga 
1321 agccaatgca cccattggac ataaccggga atcctacatg gttcctttta taccactgta 
1381 cagaaatggt gatttcttta tttcatccaa agatctgggc tatgactata gctatctaca 
1441 agattcagac ccagactctt ttcaagacta cattaagtcc tatttggaac aagcgagtcg 
1501 gatctggtca tggctccttg gggcggcgat ggtaggggcc gtcctcactg ccctgctggc 
1561 agggcttgtg agcttgctgt gtcgtcacaa gagaaagcag cttcctgaag aaaagcagcc 
1621 actcctcatg gagaaagagg attaccacag cttgtatcag agccatttat aaaaggctta 
1681 ggcaatagag tagggccaaa aagcctgacc tcactctaac tcaaagtaat gtccaggttc 
1741 ccagagaata tctgctggta tttttctgta aagaccattt gcaaaattgt aacctaatac 
1801 aaagtgtagc cttcttccaa ctcaggtaga acacacctgt ctttgtcttg ctgttttcac 
1861 tcagcccttt taacattttc ccctaagccc atatgtctaa ggaaaggatg ctatttggta 
1921 atgaggaact gttatttgta tgtgaattaa agtgctctta tttt 

NY-ESO-1 mRNA sequence 

LOCUS HSU87459 752 bp mRNA PRI 22-DEC-1999 
DEFINITION Human autoimmunogenic cancer/testis antigen NY-ESO-1 mRNA, complete 
cds. 

ACCESSION U87459 

VERSION U87459.1 GI:1890098 

(SEQ ID NO. 11) 

/translation="MQAEGRGTGGSTGDADGPGGPGIPDGPGGNAGGPGEAGATGGRGPRG 
AGAARASGPGGGAPRGPHGGAASGLNGCCRCGARGPESRLLEFYLAMPFATPMEAE 
LARRSLAQDAPPLPVPGVLLKEFTVSGNILTIRLTAADHRQLQLSISSCLQQLSLLM 
WITQCFLPVFLAQPPSGQRR" 

(SEQ ID NO. 37) 
ORIGIN 

1 atcctcgtgg gccctgacct tctctctgag agccgggcag aggctccgga gccatgcagg 
61 ccgaaggccg gggcacaggg ggttcgacgg gcgatgctga tggcccagga ggccctggca 
121 ttcctgatgg cccagggggc aatgctggcg gcccaggaga ggcgggtgcc acgggcggca 
181 gaggtccccg gggcgcaggg gcagcaaggg cctcggggcc gggaggaggc gccccgcggg 
241 gtccgcatgg cggcgcggct tcagggctga atggatgctg cagatgcggg gccagggggc 
301 cggagagccg cctgcttgag ttctacctcg ccatgccttt cgcgacaccc atggaagcag 
361 agctggcccg caggagcctg gcccaggatg ccccaccgct tcccgtgcca ggggtgcttc 
421 tgaaggagtt cactgtgtcc ggcaacatac tgactatccg actgactgct gcagaccacc 
481 gccaactgca gctctccatc agctcctgtc tccagcagct ttccctgttg atgtggatca 
541 cgcagtgctt tctgcccgtg tttttggctc agcctccctc agggcagagg cgctaagccc 
601 agcctggcgc cccttcctag gtcatgcctc ctcccctagg gaatggtccc agcacgagtg 
661 gccagttcat tgtgggggcc tgattgtttg tcgctggagg aggacggctt acatgtttgt 
721 ttctgtagaa aataaaactg agctacgaaa aa 
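
The residue ranges used as epitope names in Table 29 (for example, "NY-ESO-1 157-165") can be cross-checked against the parent protein sequence. The following is a minimal sketch using the NY-ESO-1 translation given above; the function name `locate_epitope` is illustrative and is not part of the disclosure. It reports 1-based, inclusive coordinates of the first exact match.

```python
from typing import Optional, Tuple

# NY-ESO-1 protein, copied from the /translation field of the record above.
NY_ESO_1 = (
    "MQAEGRGTGGSTGDADGPGGPGIPDGPGGNAGGPGEAGATGGRGPRG"
    "AGAARASGPGGGAPRGPHGGAASGLNGCCRCGARGPESRLLEFYLAMPFATPMEAE"
    "LARRSLAQDAPPLPVPGVLLKEFTVSGNILTIRLTAADHRQLQLSISSCLQQLSLLM"
    "WITQCFLPVFLAQPPSGQRR"
)


def locate_epitope(protein: str, epitope: str) -> Optional[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) of the first exact match, or None."""
    index = protein.find(epitope)
    if index < 0:
        return None
    return index + 1, index + len(epitope)


print(locate_epitope(NY_ESO_1, "SLLMWITQC"))  # (157, 165), SEQ ID NO. 12
print(locate_epitope(NY_ESO_1, "TQCFLPVFL"))  # (163, 171), SEQ ID NO. 14
print(locate_epitope(NY_ESO_1, "KASEKIFYV"))  # None: an SSX-2 epitope, not in NY-ESO-1
```

The same lookup applied to an expression product such as SEQ ID NO. 17 would indicate where each encoded epitope sits within the construct, although it cannot by itself show that the epitope is liberated by processing.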

PSMA cDNA sequence 

LOCUS NM_004476 2653 bp mRNA PRI 01-NOV-2000 
DEFINITION Homo sapiens folate hydrolase (prostate-specific membrane antigen) 
1 (FOLH1), mRNA. 
ACCESSION NM_004476 
VERSION NM_004476.1 GI:4758397 

(SEQ ID NO. 38) 

/translation="MWNLLHETDSAVATARRPRWLCAGALVLAGGFFLLGFLFGWFIKSSNE 

ATNITPKHNMKAFLDELKAENIKKFLYNFTQIPHLAGTEQNFQLAKQIQSQWKEFGL 

DSVELAHYDVLLSYPNKTHPNYISIINEDGNEIFNTSLFEPPPPGYENVSDIVPPFSAFSP 

QGMPEGDLVYVNYARTEDFFKLERDMKINCSGKIVIARYGKVFRGNKVKNAQLAG 

AKGVILYSDPADYFAPGVKSYPDGWNLPGGGVQRGNILNLNGAGDPLTPGYPANEY 

AYRRGIAEAVGLPSIPVHPIGYYDAQKLLEKMGGSAPPDSSWRGSLKVPYNVGPGFT 

GNFSTQKVKMHIHSTNEVTRIYNVIGTLRGAVEPDRYVILGGHRDSWVFGGIDPQSG 

AAVVHEIVRSFGTLKKEGWRPRRTILFASWDAEEFGLLGSTEWAEENSRLLQERGVA 

YINADSSIEGNYTLRVDCTPLMYSLVHNLTKELKSPDEGFEGKSLYESWTKKSPSPEF 

SGMPRISKLGSGNDFEVFFQRLGIASGRARYTKNWETNKFSGYPLYHSVYETYELVE 

KFYDPMFKYHLTVAQVRGGMVFELANSIVLPFDCRDYAVVLRKYADKIYSISMKHP 

QEMKTYSVSFDSLFSAVKNFTEIASKFSERLQDFDKSNPIVLRMMNDQLMFLERAFID 

PLGLPDRPFYRHVIYAPSSHNKYAGESFPGIYDALFDIESKVDPSKAWGEVKRQIYVA 

AFTVQAAAETLSEVA" 

(SEQ ID NO. 39) 
ORIGIN 

1 ctcaaaaggg gccggatttc cttctcctgg aggcagatgt tgcctctctc tctcgctcgg 
61 attggttcag tgcactctag aaacactgct gtggtggaga aactggaccc caggtctgga 
121 gcgaattcca gcctgcaggg ctgataagcg aggcattagt gagattgaga gagactttac 
181 cccgccgtgg tggttggagg gcgcgcagta gagcagcagc acaggcgcgg gtcccgggag 
241 gccggctctg ctcgcgccga gatgtggaat ctccttcacg aaaccgactc ggctgtggcc 
301 accgcgcgcc gcccgcgctg gctgtgcgct ggggcgctgg tgctggcggg tggcttcttt 
361 ctcctcggct tcctcttcgg gtggtttata aaatcctcca atgaagctac taacattact 
421 ccaaagcata atatgaaagc atttttggat gaattgaaag ctgagaacat caagaagttc 
481 ttatataatt ttacacagat accacattta gcaggaacag aacaaaactt tcagcttgca 
541 aagcaaattc aatcccagtg gaaagaattt ggcctggatt ctgttgagct agcacattat 
601 gatgtcctgt tgtcctaccc aaataagact catcccaact acatctcaat aattaatgaa 
661 gatggaaatg agattttcaa cacatcatta tttgaaccac ctcctccagg atatgaaaat 
721 gtttcggata ttgtaccacc tttcagtgct ttctctcctc aaggaatgcc agagggcgat 
781 ctagtgtatg ttaactatgc acgaactgaa gacttcttta aattggaacg ggacatgaaa 
841 atcaattgct ctgggaaaat tgtaattgcc agatatggga aagttttcag aggaaataag 
901 gttaaaaatg cccagctggc aggggccaaa ggagtcattc tctactccga ccctgctgac 
961 tactttgctc ctggggtgaa gtcctatcca gatggttgga atcttcctgg aggtggtgtc 
1021 cagcgtggaa atatcctaaa tctgaatggt gcaggagacc ctctcacacc aggttaccca 
1081 gcaaatgaat atgcttatag gcgtggaatt gcagaggctg ttggtcttcc aagtattcct 
1141 gttcatccaa ttggatacta tgatgcacag aagctcctag aaaaaatggg tggctcagca 
1201 ccaccagata gcagctggag aggaagtctc aaagtgccct acaatgttgg acctggcttt 
1261 actggaaact tttctacaca aaaagtcaag atgcacatcc actctaccaa tgaagtgaca 
1321 agaatttaca atgtgatagg tactctcaga ggagcagtgg aaccagacag atatgtcatt 
1381 ctgggaggtc accgggactc atgggtgttt ggtggtattg accctcagag tggagcagct 
1441 gttgttcatg aaattgtgag gagctttgga acactgaaaa aggaagggtg gagacctaga 
1501 agaacaattt tgtttgcaag ctgggatgca gaagaatttg gtcttcttgg ttctactgag 
1561 tgggcagagg agaattcaag actccttcaa gagcgtggcg tggcttatat taatgctgac 
1621 tcatctatag aaggaaacta cactctgaga gttgattgta caccgctgat gtacagcttg 
1681 gtacacaacc taacaaaaga gctgaaaagc cctgatgaag gctttgaagg caaatctctt 
1741 tatgaaagtt ggactaaaaa aagtccttcc ccagagttca gtggcatgcc caggataagc 
1801 aaattgggat ctggaaatga ttttgaggtg ttcttccaac gacttggaat tgcttcaggc 
1861 agagcacggt atactaaaaa ttgggaaaca aacaaattca gcggctatcc actgtatcac 
1921 agtgtctatg aaacatatga gttggtggaa aagttttatg atccaatgtt taaatatcac 
1981 ctcactgtgg cccaggttcg aggagggatg gtgtttgagc tagccaattc catagtgctc 
2041 ccttttgatt gtcgagatta tgctgtagtt ttaagaaagt atgctgacaa aatctacagt 
2101 atttctatga aacatccaca ggaaatgaag acatacagtg tatcatttga ttcacttttt 
2161 tctgcagtaa agaattttac agaaattgct tccaagttca gtgagagact ccaggacttt 
2221 gacaaaagca acccaatagt attaagaatg atgaatgatc aactcatgtt tctggaaaga 
2281 gcatttattg atccattagg gttaccagac aggccttttt ataggcatgt catctatgct 
2341 ccaagcagcc acaacaagta tgcaggggag tcattcccag gaatttatga tgctctgttt 
2401 gatattgaaa gcaaagtgga cccttccaag gcctggggag aagtgaagag acagatttat 
2461 gttgcagcct tcacagtgca ggcagctgca gagactttga gtgaagtagc ctaagaggat 
2521 tctttagaga atccgtattg aatttgtgtg gtatgtcact cagaaagaat cgtaatgggt 
2581 atattgataa attttaaaat tggtatattt gaaataaagt tgaatattat atataaaaaa 
2641 aaaaaaaaaa aaa 



NM_003147 Homo sapiens synovial sarcoma, X breakpoint 2 (SSX2), mRNA 

LOCUS NM_003147 766 bp mRNA PRI 14-MAR-2001 

DEFINITION Homo sapiens synovial sarcoma, X breakpoint 2 (SSX2), mRNA. 
ACCESSION NM_003147 

VERSION NM_003147.1 GI:10337582 

SEQ ID NO. 40 

/translation="MNGDDAFARRPTVGAQIPEKIQKAFDDIAKYFSKEEWEKMKASE 
KIFYVYMKRKYEAMTKLGFKATLPPFMCNKRAEDFQGNDLDNDPNRGNQVERPQMTFG 
RLQGISPKIMPKKPAEEGNDSEEVPEASGPQNDGKELCPPGKPTTSEKIHERSGPKRG 
EHAWTHRLRERKQLVIYEEISDPEEDDE" 



SEQ ID NO. 41 

1 ctctctttcg attcttccat actcagagta cgcacggtct gattttctct ttggattctt 
61 ccaaaatcag agtcagactg ctcccggtgc catgaacgga gacgacgcct ttgcaaggag 
121 acccacggtt ggtgctcaaa taccagagaa gatccaaaag gccttcgatg atattgccaa 
181 atacttctct aaggaagagt gggaaaagat gaaagcctcg gagaaaatct tctatgtgta 
241 tatgaagaga aagtatgagg ctatgactaa actaggtttc aaggccaccc tcccaccttt 
301 catgtgtaat aaacgggccg aagacttcca ggggaatgat ttggataatg accctaaccg 
361 tgggaatcag gttgaacgtc ctcagatgac tttcggcagg ctccagggaa tctccccgaa 
421 gatcatgccc aagaagccag cagaggaagg aaatgattcg gaggaagtgc cagaagcatc 
481 tggcccacaa aatgatggga aagagctgtg ccccccggga aaaccaacta cctctgagaa 
541 gattcacgag agatctggac ccaaaagggg ggaacatgcc tggacccaca gactgcgtga 
601 gagaaaacag ctggtgattt atgaagagat cagcgaccct gaggaagatg acgagtaact 
661 cccctcaggg atacgacaca tgcccatgat gagaagcaga acgtggtgac ctttcacgaa 
721 catgggcatg gctgcggacc cctcgtcatc aggtgcatag caagtg 